* [PATCH 00/29] x86/boot/e820: Assorted E820 table handling features and cleanups
@ 2025-04-21 18:51 Ingo Molnar
2025-04-21 18:51 ` [PATCH 01/29] x86/boot/e820: Remove inverted boolean logic from the e820_nomerge() function name, rename it to e820_type_mergeable() Ingo Molnar
` (29 more replies)
0 siblings, 30 replies; 68+ messages in thread
From: Ingo Molnar @ 2025-04-21 18:51 UTC (permalink / raw)
To: linux-kernel
Cc: Andy Shevchenko, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, Ingo Molnar
So I was looking into a E820 table related bug report by
Paul Menzel, and I wanted to implement the behavior
suggested by the ACPI specification, which bug/problem
results in unbootable Linux systems with certain bootloaders:
https://lore.kernel.org/r/074c2637-1b65-428e-b3e2-24384780e936@molgen.mpg.de
One thing led to another, and now I'm here 29 patches later,
trying to explain what they all do. :-/
In order of importance:
- The bugfix / change of Linux kernel E820 table parsing behavior:
x86/boot/e820: Treat non-type-2 'reserved' E820 region types as E820_TYPE_RESERVED
- A change to e820_search_gap() to fix an implementational oddity
that would prefer lower-address same-size PCI gaps over larger-address
PCI gaps. Now the implementation searches for the largest gap:
x86/boot/e820: Change e820_search_gap() to search for the highest-address PCI gap
- A rewrite of e820_search_gap() to search the E820 table in ascending
order to make sure even weird PCI holes get found, as claimed by the
comments around the code (but not properly implemented):
x86/boot/e820: Make sure e820_search_gap() finds all gaps
- Remove the exclusion of single-entry E820 tables passed in by
firmware:
x86/boot/e820: Simplify append_e820_table() and remove restriction on single-entry tables
- A debuggability improvement, to print the sizes of the e820 entries
and the holes as well, because parsing raw hexadecimal ranges is hard
for humans:
x86/boot/e820: Print gaps in the E820 table
x86/boot/e820: Print out sizes of E820 memory ranges
Before:
BIOS-provided physical RAM map:
BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
BIOS-e820: [mem 0x0000000000100000-0x000000007ffdbfff] usable
BIOS-e820: [mem 0x000000007ffdc000-0x000000007fffffff] reserved
BIOS-e820: [mem 0x00000000b0000000-0x00000000bfffffff] reserved
BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved
BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
BIOS-e820: [mem 0x000000fd00000000-0x000000ffffffffff] reserved
After:
BIOS-provided physical RAM map:
BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] 639 KB kernel usable RAM
BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] 1 KB device reserved
BIOS-e820: [gap 0x00000000000a0000-0x00000000000effff] 320 KB ...
BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] 64 KB device reserved
BIOS-e820: [mem 0x0000000000100000-0x000000007ffdbfff] 1.9 GB kernel usable RAM
BIOS-e820: [mem 0x000000007ffdc000-0x000000007fffffff] 144 KB device reserved
BIOS-e820: [gap 0x0000000080000000-0x00000000afffffff] 768 MB ...
BIOS-e820: [mem 0x00000000b0000000-0x00000000bfffffff] 256 MB device reserved
BIOS-e820: [gap 0x00000000c0000000-0x00000000fed1bfff] 1005.1 MB ...
BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] 16 KB device reserved
BIOS-e820: [gap 0x00000000fed20000-0x00000000feffbfff] 2.8 MB ...
BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] 16 KB device reserved
BIOS-e820: [gap 0x00000000ff000000-0x00000000fffbffff] 15.7 MB ...
BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] 256 KB device reserved
BIOS-e820: [gap 0x0000000100000000-0x000000fcffffffff] 1008 GB ...
BIOS-e820: [mem 0x000000fd00000000-0x000000ffffffffff] 12 GB device reserved
Note how weirdly broken up ranges are printed with fractional size
values, while 'round' ranges are printed as natural numbers.
- Assorted cleanups: type cleanups, simplifications, standardization
of coding patterns, etc.
The tree can be found in the WIP.x86/e820 branch of the mingo/tip.git tree:
git://git.kernel.org/pub/scm/linux/kernel/git/mingo/tip.git WIP.x86/e820
It's lightly tested at the moment.
Thanks,
Ingo
===============>
Ingo Molnar (29):
x86/boot/e820: Remove inverted boolean logic from the e820_nomerge() function name, rename it to e820_type_mergeable()
x86/boot/e820: Simplify e820__print_table() a bit
x86/boot/e820: Simplify the PPro Erratum #50 workaround
x86/boot/e820: Mark e820__print_table() static
x86/boot/e820: Print gaps in the E820 table
x86/boot/e820: Make the field separator space character part of e820_print_type()
x86/boot/e820: Print out sizes of E820 memory ranges
x86/boot/e820: Print E820_TYPE_RAM entries as ... RAM entries
x86/boot/e820: Call the PCI gap a 'gap' in the boot log printout
x86/boot/e820: Use 'u64' consistently instead of 'unsigned long long'
x86/boot/e820: Remove pointless early_panic() indirection
x86/boot/e820: Clean up confusing and self-contradictory verbiage around E820 related resource allocations
x86/boot/e820: Improve e820_print_type() messages
x86/boot/e820: Clean up __e820__range_add() a bit
x86/boot/e820: Clean up __refdata use a bit
x86/boot/e820: Remove unnecessary header inclusions
x86/boot/e820: Standardize e820 table index variable names under 'idx'
x86/boot/e820: Change struct e820_table::nr_entries type from __u32 to u32
x86/boot/e820: Standardize e820 table index variable types under 'u32'
x86/boot/e820: Clean up e820__setup_pci_gap()/e820_search_gap() a bit
x86/boot/e820: Change e820_search_gap() to search for the highest-address PCI gap
x86/boot/e820: Rename gap_start/gap_size to max_gap_start/max_gap_start in e820_search_gap() et al
x86/boot/e820: Simplify & clarify __e820__range_add() a bit
x86/boot/e820: Standardize __init/__initdata tag placement
x86/boot/e820: Simplify append_e820_table() and remove restriction on single-entry tables
x86/boot/e820: Remove e820__range_remove()'s unused return parameter
x86/boot/e820: Simplify the e820__range_remove() API
x86/boot/e820: Make sure e820_search_gap() finds all gaps
x86/boot/e820: Treat non-type-2 'reserved' E820 region types as E820_TYPE_RESERVED
arch/x86/include/asm/e820/api.h | 3 +-
arch/x86/include/asm/e820/types.h | 2 +-
arch/x86/kernel/e820.c | 518 ++++++++++++++++++++++----------------
arch/x86/kernel/setup.c | 10 +-
arch/x86/platform/efi/efi.c | 3 +-
5 files changed, 307 insertions(+), 229 deletions(-)
--
2.45.2
^ permalink raw reply [flat|nested] 68+ messages in thread
* [PATCH 01/29] x86/boot/e820: Remove inverted boolean logic from the e820_nomerge() function name, rename it to e820_type_mergeable()
2025-04-21 18:51 [PATCH 00/29] x86/boot/e820: Assorted E820 table handling features and cleanups Ingo Molnar
@ 2025-04-21 18:51 ` Ingo Molnar
2025-04-21 18:51 ` [PATCH 02/29] x86/boot/e820: Simplify e820__print_table() a bit Ingo Molnar
` (28 subsequent siblings)
29 siblings, 0 replies; 68+ messages in thread
From: Ingo Molnar @ 2025-04-21 18:51 UTC (permalink / raw)
To: linux-kernel
Cc: Andy Shevchenko, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
David Woodhouse
It's a bad practice to put inverted logic into function names,
flip it back and rename it to e820_type_mergeable().
Add/update a few comments about this function while at it.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Shevchenko <andy@kernel.org>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
arch/x86/kernel/e820.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 9d8dd8deb2a7..4a81f9e94137 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -304,18 +304,22 @@ static int __init cpcompare(const void *a, const void *b)
return (ap->addr != ap->entry->addr) - (bp->addr != bp->entry->addr);
}
-static bool e820_nomerge(enum e820_type type)
+/*
+ * Can two consecutive E820 entries of this same E820 type be merged?
+ */
+static bool e820_type_mergeable(enum e820_type type)
{
/*
* These types may indicate distinct platform ranges aligned to
- * numa node, protection domain, performance domain, or other
+ * NUMA node, protection domain, performance domain, or other
* boundaries. Do not merge them.
*/
if (type == E820_TYPE_PRAM)
- return true;
+ return false;
if (type == E820_TYPE_SOFT_RESERVED)
- return true;
- return false;
+ return false;
+
+ return true;
}
int __init e820__update_table(struct e820_table *table)
@@ -393,7 +397,7 @@ int __init e820__update_table(struct e820_table *table)
}
/* Continue building up new map based on this information: */
- if (current_type != last_type || e820_nomerge(current_type)) {
+ if (current_type != last_type || !e820_type_mergeable(current_type)) {
if (last_type) {
new_entries[new_nr_entries].size = change_point[chg_idx]->addr - last_addr;
/* Move forward only if the new size was non-zero: */
--
2.45.2
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 02/29] x86/boot/e820: Simplify e820__print_table() a bit
2025-04-21 18:51 [PATCH 00/29] x86/boot/e820: Assorted E820 table handling features and cleanups Ingo Molnar
2025-04-21 18:51 ` [PATCH 01/29] x86/boot/e820: Remove inverted boolean logic from the e820_nomerge() function name, rename it to e820_type_mergeable() Ingo Molnar
@ 2025-04-21 18:51 ` Ingo Molnar
2025-04-22 5:56 ` Mike Rapoport
2025-04-21 18:51 ` [PATCH 03/29] x86/boot/e820: Simplify the PPro Erratum #50 workaround Ingo Molnar
` (27 subsequent siblings)
29 siblings, 1 reply; 68+ messages in thread
From: Ingo Molnar @ 2025-04-21 18:51 UTC (permalink / raw)
To: linux-kernel
Cc: Andy Shevchenko, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
David Woodhouse
Introduce 'entry' for the current table entry and shorten
repetitious use of e820_table->entries[i].
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Shevchenko <andy@kernel.org>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
arch/x86/kernel/e820.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 4a81f9e94137..b1a30bca56cd 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -204,12 +204,14 @@ void __init e820__print_table(char *who)
int i;
for (i = 0; i < e820_table->nr_entries; i++) {
+ struct e820_entry *entry = e820_table->entries + i;
+
pr_info("%s: [mem %#018Lx-%#018Lx] ",
who,
- e820_table->entries[i].addr,
- e820_table->entries[i].addr + e820_table->entries[i].size - 1);
+ entry->addr,
+ entry->addr + entry->size-1);
- e820_print_type(e820_table->entries[i].type);
+ e820_print_type(entry->type);
pr_cont("\n");
}
}
--
2.45.2
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 03/29] x86/boot/e820: Simplify the PPro Erratum #50 workaround
2025-04-21 18:51 [PATCH 00/29] x86/boot/e820: Assorted E820 table handling features and cleanups Ingo Molnar
2025-04-21 18:51 ` [PATCH 01/29] x86/boot/e820: Remove inverted boolean logic from the e820_nomerge() function name, rename it to e820_type_mergeable() Ingo Molnar
2025-04-21 18:51 ` [PATCH 02/29] x86/boot/e820: Simplify e820__print_table() a bit Ingo Molnar
@ 2025-04-21 18:51 ` Ingo Molnar
2025-04-22 6:29 ` Andy Shevchenko
2025-04-21 18:51 ` [PATCH 04/29] x86/boot/e820: Mark e820__print_table() static Ingo Molnar
` (26 subsequent siblings)
29 siblings, 1 reply; 68+ messages in thread
From: Ingo Molnar @ 2025-04-21 18:51 UTC (permalink / raw)
To: linux-kernel
Cc: Andy Shevchenko, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
David Woodhouse
No need to print out the table - users won't really be able
to tell much from it anyway and the messages around this
erratum are unnecessarily obtuse.
Instead clearly inform the user that a 256 kB hole is being
punched in their memory map at the 1.75 GB physical address.
Not that there are many PPro users left. :-)
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Shevchenko <andy@kernel.org>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
arch/x86/kernel/setup.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 9d2a13b37833..83cc265140c0 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -988,11 +988,9 @@ void __init setup_arch(char **cmdline_p)
trim_bios_range();
#ifdef CONFIG_X86_32
if (ppro_with_ram_bug()) {
- e820__range_update(0x70000000ULL, 0x40000ULL, E820_TYPE_RAM,
- E820_TYPE_RESERVED);
+ pr_info("Applying PPro RAM bug workaround: punching 256 kB hole at 1.75 GB physical.\n");
+ e820__range_update(0x70000000ULL, 0x40000ULL, E820_TYPE_RAM, E820_TYPE_RESERVED);
e820__update_table(e820_table);
- printk(KERN_INFO "fixed physical RAM map:\n");
- e820__print_table("bad_ppro");
}
#else
early_gart_iommu_check();
--
2.45.2
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 04/29] x86/boot/e820: Mark e820__print_table() static
2025-04-21 18:51 [PATCH 00/29] x86/boot/e820: Assorted E820 table handling features and cleanups Ingo Molnar
` (2 preceding siblings ...)
2025-04-21 18:51 ` [PATCH 03/29] x86/boot/e820: Simplify the PPro Erratum #50 workaround Ingo Molnar
@ 2025-04-21 18:51 ` Ingo Molnar
2025-04-21 18:51 ` [PATCH 05/29] x86/boot/e820: Print gaps in the E820 table Ingo Molnar
` (25 subsequent siblings)
29 siblings, 0 replies; 68+ messages in thread
From: Ingo Molnar @ 2025-04-21 18:51 UTC (permalink / raw)
To: linux-kernel
Cc: Andy Shevchenko, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
David Woodhouse
There are no external users of this function left.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Shevchenko <andy@kernel.org>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
arch/x86/include/asm/e820/api.h | 1 -
arch/x86/kernel/e820.c | 2 +-
2 files changed, 1 insertion(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/e820/api.h b/arch/x86/include/asm/e820/api.h
index c83645d5b2a8..54427b77bc19 100644
--- a/arch/x86/include/asm/e820/api.h
+++ b/arch/x86/include/asm/e820/api.h
@@ -19,7 +19,6 @@ extern u64 e820__range_update(u64 start, u64 size, enum e820_type old_type, enu
extern u64 e820__range_remove(u64 start, u64 size, enum e820_type old_type, bool check_type);
extern u64 e820__range_update_table(struct e820_table *t, u64 start, u64 size, enum e820_type old_type, enum e820_type new_type);
-extern void e820__print_table(char *who);
extern int e820__update_table(struct e820_table *table);
extern void e820__update_table_print(void);
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index b1a30bca56cd..4244b6d53fd0 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -199,7 +199,7 @@ static void __init e820_print_type(enum e820_type type)
}
}
-void __init e820__print_table(char *who)
+static void __init e820__print_table(const char *who)
{
int i;
--
2.45.2
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 05/29] x86/boot/e820: Print gaps in the E820 table
2025-04-21 18:51 [PATCH 00/29] x86/boot/e820: Assorted E820 table handling features and cleanups Ingo Molnar
` (3 preceding siblings ...)
2025-04-21 18:51 ` [PATCH 04/29] x86/boot/e820: Mark e820__print_table() static Ingo Molnar
@ 2025-04-21 18:51 ` Ingo Molnar
2025-04-22 6:00 ` Mike Rapoport
2025-04-22 6:42 ` Andy Shevchenko
2025-04-21 18:51 ` [PATCH 06/29] x86/boot/e820: Make the field separator space character part of e820_print_type() Ingo Molnar
` (24 subsequent siblings)
29 siblings, 2 replies; 68+ messages in thread
From: Ingo Molnar @ 2025-04-21 18:51 UTC (permalink / raw)
To: linux-kernel
Cc: Andy Shevchenko, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
David Woodhouse
Gaps in the E820 table are not obvious at a glance and can
easily be overlooked.
Print out gaps in the E820 table:
Before:
BIOS-provided physical RAM map:
BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
BIOS-e820: [mem 0x0000000000100000-0x000000007ffdbfff] usable
BIOS-e820: [mem 0x000000007ffdc000-0x000000007fffffff] reserved
BIOS-e820: [mem 0x00000000b0000000-0x00000000bfffffff] reserved
BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved
BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
BIOS-e820: [mem 0x000000fd00000000-0x000000ffffffffff] reserved
After:
BIOS-provided physical RAM map:
BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
BIOS-e820: [gap 0x00000000000a0000-0x00000000000effff]
BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
BIOS-e820: [mem 0x0000000000100000-0x000000007ffdbfff] usable
BIOS-e820: [mem 0x000000007ffdc000-0x000000007fffffff] reserved
BIOS-e820: [gap 0x0000000080000000-0x00000000afffffff]
BIOS-e820: [mem 0x00000000b0000000-0x00000000bfffffff] reserved
BIOS-e820: [gap 0x00000000c0000000-0x00000000fed1bfff]
BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved
BIOS-e820: [gap 0x00000000fed20000-0x00000000feffbfff]
BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
BIOS-e820: [gap 0x00000000ff000000-0x00000000fffbffff]
BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
BIOS-e820: [gap 0x0000000100000000-0x000000fcffffffff]
BIOS-e820: [mem 0x000000fd00000000-0x000000ffffffffff] reserved
Also warn about badly ordered E820 table entries:
BUG: out of order E820 entry!
( this is printed before the entry is printed, so there's no need to
print any additional data with the warning. )
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Shevchenko <andy@kernel.org>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
arch/x86/kernel/e820.c | 22 ++++++++++++++++++----
1 file changed, 18 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 4244b6d53fd0..a8edfa375fbd 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -201,18 +201,32 @@ static void __init e820_print_type(enum e820_type type)
static void __init e820__print_table(const char *who)
{
+ u64 range_end_prev = 0;
int i;
for (i = 0; i < e820_table->nr_entries; i++) {
struct e820_entry *entry = e820_table->entries + i;
+ u64 range_start, range_end;
- pr_info("%s: [mem %#018Lx-%#018Lx] ",
- who,
- entry->addr,
- entry->addr + entry->size-1);
+ range_start = entry->addr;
+ range_end = entry->addr + entry->size;
+ /* Out of order E820 maps should not happen: */
+ if (range_start < range_end_prev)
+ pr_info("BUG: out of order E820 entry!\n");
+
+ if (range_start > range_end_prev) {
+ pr_info("%s: [gap %#018Lx-%#018Lx]\n",
+ who,
+ range_end_prev,
+ range_start-1);
+ }
+
+ pr_info("%s: [mem %#018Lx-%#018Lx] ", who, range_start, range_end-1);
e820_print_type(entry->type);
pr_cont("\n");
+
+ range_end_prev = range_end;
}
}
--
2.45.2
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 06/29] x86/boot/e820: Make the field separator space character part of e820_print_type()
2025-04-21 18:51 [PATCH 00/29] x86/boot/e820: Assorted E820 table handling features and cleanups Ingo Molnar
` (4 preceding siblings ...)
2025-04-21 18:51 ` [PATCH 05/29] x86/boot/e820: Print gaps in the E820 table Ingo Molnar
@ 2025-04-21 18:51 ` Ingo Molnar
2025-04-21 18:51 ` [PATCH 07/29] x86/boot/e820: Print out sizes of E820 memory ranges Ingo Molnar
` (23 subsequent siblings)
29 siblings, 0 replies; 68+ messages in thread
From: Ingo Molnar @ 2025-04-21 18:51 UTC (permalink / raw)
To: linux-kernel
Cc: Andy Shevchenko, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
David Woodhouse
We are going to add more columns to the E820 table printout,
so make e820_print_type()'s field separator (space character)
part of the function itself.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Shevchenko <andy@kernel.org>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
arch/x86/kernel/e820.c | 22 +++++++++++-----------
1 file changed, 11 insertions(+), 11 deletions(-)
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index a8edfa375fbd..10bd10bd5672 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -187,15 +187,15 @@ void __init e820__range_add(u64 start, u64 size, enum e820_type type)
static void __init e820_print_type(enum e820_type type)
{
switch (type) {
- case E820_TYPE_RAM: pr_cont("usable"); break;
- case E820_TYPE_RESERVED: pr_cont("reserved"); break;
- case E820_TYPE_SOFT_RESERVED: pr_cont("soft reserved"); break;
- case E820_TYPE_ACPI: pr_cont("ACPI data"); break;
- case E820_TYPE_NVS: pr_cont("ACPI NVS"); break;
- case E820_TYPE_UNUSABLE: pr_cont("unusable"); break;
+ case E820_TYPE_RAM: pr_cont(" usable"); break;
+ case E820_TYPE_RESERVED: pr_cont(" reserved"); break;
+ case E820_TYPE_SOFT_RESERVED: pr_cont(" soft reserved"); break;
+ case E820_TYPE_ACPI: pr_cont(" ACPI data"); break;
+ case E820_TYPE_NVS: pr_cont(" ACPI NVS"); break;
+ case E820_TYPE_UNUSABLE: pr_cont(" unusable"); break;
case E820_TYPE_PMEM: /* Fall through: */
- case E820_TYPE_PRAM: pr_cont("persistent (type %u)", type); break;
- default: pr_cont("type %u", type); break;
+ case E820_TYPE_PRAM: pr_cont(" persistent (type %u)", type); break;
+ default: pr_cont(" type %u", type); break;
}
}
@@ -491,9 +491,9 @@ __e820__range_update(struct e820_table *table, u64 start, u64 size, enum e820_ty
size = ULLONG_MAX - start;
end = start + size;
- printk(KERN_DEBUG "e820: update [mem %#010Lx-%#010Lx] ", start, end - 1);
+ printk(KERN_DEBUG "e820: update [mem %#010Lx-%#010Lx]", start, end - 1);
e820_print_type(old_type);
- pr_cont(" ==> ");
+ pr_cont(" ==>");
e820_print_type(new_type);
pr_cont("\n");
@@ -568,7 +568,7 @@ u64 __init e820__range_remove(u64 start, u64 size, enum e820_type old_type, bool
size = ULLONG_MAX - start;
end = start + size;
- printk(KERN_DEBUG "e820: remove [mem %#010Lx-%#010Lx] ", start, end - 1);
+ printk(KERN_DEBUG "e820: remove [mem %#010Lx-%#010Lx]", start, end - 1);
if (check_type)
e820_print_type(old_type);
pr_cont("\n");
--
2.45.2
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 07/29] x86/boot/e820: Print out sizes of E820 memory ranges
2025-04-21 18:51 [PATCH 00/29] x86/boot/e820: Assorted E820 table handling features and cleanups Ingo Molnar
` (5 preceding siblings ...)
2025-04-21 18:51 ` [PATCH 06/29] x86/boot/e820: Make the field separator space character part of e820_print_type() Ingo Molnar
@ 2025-04-21 18:51 ` Ingo Molnar
2025-04-22 6:38 ` Andy Shevchenko
2025-04-22 6:53 ` Mike Rapoport
2025-04-21 18:51 ` [PATCH 08/29] x86/boot/e820: Print E820_TYPE_RAM entries as ... RAM entries Ingo Molnar
` (22 subsequent siblings)
29 siblings, 2 replies; 68+ messages in thread
From: Ingo Molnar @ 2025-04-21 18:51 UTC (permalink / raw)
To: linux-kernel
Cc: Andy Shevchenko, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
David Woodhouse
Before:
BIOS-provided physical RAM map:
BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
BIOS-e820: [mem 0x0000000000100000-0x000000007ffdbfff] usable
BIOS-e820: [mem 0x000000007ffdc000-0x000000007fffffff] reserved
BIOS-e820: [mem 0x00000000b0000000-0x00000000bfffffff] reserved
BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved
BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
BIOS-e820: [mem 0x000000fd00000000-0x000000ffffffffff] reserved
After:
BIOS-provided physical RAM map:
BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] 639 KB kernel usable RAM
BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] 1 KB reserved
BIOS-e820: [gap 0x00000000000a0000-0x00000000000effff] 320 KB ...
BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] 64 KB reserved
BIOS-e820: [mem 0x0000000000100000-0x000000007ffdbfff] 1.9 GB kernel usable RAM
BIOS-e820: [mem 0x000000007ffdc000-0x000000007fffffff] 144 KB reserved
BIOS-e820: [gap 0x0000000080000000-0x00000000afffffff] 768 MB ...
BIOS-e820: [mem 0x00000000b0000000-0x00000000bfffffff] 256 MB reserved
BIOS-e820: [gap 0x00000000c0000000-0x00000000fed1bfff] 1005.1 MB ...
BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] 16 KB reserved
BIOS-e820: [gap 0x00000000fed20000-0x00000000feffbfff] 2.8 MB ...
BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] 16 KB reserved
BIOS-e820: [gap 0x00000000ff000000-0x00000000fffbffff] 15.7 MB ...
BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] 256 KB reserved
BIOS-e820: [gap 0x0000000100000000-0x000000fcffffffff] 1008 GB ...
BIOS-e820: [mem 0x000000fd00000000-0x000000ffffffffff] 12 GB reserved
Note how a 1-digit precision field is printed out if a range is
fractional in its largest-enclosing natural size unit.
So the "256 MB" and "12 GB" fields above denote exactly 256 MB and
12 GB regions, while "1.9 GB" signals the region's fractional nature
and it being just below 2GB.
Printing E820 maps with such details visualizes 'weird' ranges
at a glance, and gives users a better understanding of how
large the various ranges are, without having to perform hexadecimal
subtraction in their minds.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Shevchenko <andy@kernel.org>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
(cherry picked from commit d1ac6b8718575a7ea2f0a1ff347835a8879df673)
---
arch/x86/kernel/e820.c | 47 +++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 45 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 10bd10bd5672..8ee89962fcbf 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -199,6 +199,41 @@ static void __init e820_print_type(enum e820_type type)
}
}
+/*
+ * Print out the size of a E820 region, in human-readable
+ * fashion, going from KB, MB, GB to TB units.
+ *
+ * Print out fractional sizes with a single digit of precision.
+ */
+static void e820_print_size(u64 size)
+{
+ if (size < SZ_1M) {
+ if (size & (SZ_1K-1))
+ pr_cont(" %4llu.%01llu KB", size/SZ_1K, 10*(size & (SZ_1K-1))/SZ_1K);
+ else
+ pr_cont(" %4llu KB", size/SZ_1K);
+ return;
+ }
+ if (size < SZ_1G) {
+ if (size & (SZ_1M-1))
+ pr_cont(" %4llu.%01llu MB", size/SZ_1M, 10*(size & (SZ_1M-1))/SZ_1M);
+ else
+ pr_cont(" %4llu MB", size/SZ_1M);
+ return;
+ }
+ if (size < SZ_1T) {
+ if (size & (SZ_1G-1))
+ pr_cont(" %4llu.%01llu GB", size/SZ_1G, 10*(size & (SZ_1G-1))/SZ_1G);
+ else
+ pr_cont(" %4llu GB", size/SZ_1G);
+ return;
+ }
+ if (size & (SZ_1T-1))
+ pr_cont(" %4llu.%01llu TB", size/SZ_1T, 10*(size & (SZ_1T-1))/SZ_1T);
+ else
+ pr_cont(" %4llu TB", size/SZ_1T);
+}
+
static void __init e820__print_table(const char *who)
{
u64 range_end_prev = 0;
@@ -215,14 +250,22 @@ static void __init e820__print_table(const char *who)
if (range_start < range_end_prev)
pr_info("BUG: out of order E820 entry!\n");
+ /* Print gaps, if any: */
if (range_start > range_end_prev) {
- pr_info("%s: [gap %#018Lx-%#018Lx]\n",
+ u64 gap_size = range_start - range_end_prev;
+
+ pr_info("%s: [gap %#018Lx-%#018Lx]",
who,
range_end_prev,
range_start-1);
+
+ e820_print_size(gap_size);
+ pr_cont(" ...\n");
}
- pr_info("%s: [mem %#018Lx-%#018Lx] ", who, range_start, range_end-1);
+ /* Print allocated ranges: */
+ pr_info("%s: [mem %#018Lx-%#018Lx]", who, range_start, range_end-1);
+ e820_print_size(entry->size);
e820_print_type(entry->type);
pr_cont("\n");
--
2.45.2
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 08/29] x86/boot/e820: Print E820_TYPE_RAM entries as ... RAM entries
2025-04-21 18:51 [PATCH 00/29] x86/boot/e820: Assorted E820 table handling features and cleanups Ingo Molnar
` (6 preceding siblings ...)
2025-04-21 18:51 ` [PATCH 07/29] x86/boot/e820: Print out sizes of E820 memory ranges Ingo Molnar
@ 2025-04-21 18:51 ` Ingo Molnar
2025-04-22 6:31 ` Andy Shevchenko
2025-04-22 7:06 ` Mike Rapoport
2025-04-21 18:51 ` [PATCH 09/29] x86/boot/e820: Call the PCI gap a 'gap' in the boot log printout Ingo Molnar
` (21 subsequent siblings)
29 siblings, 2 replies; 68+ messages in thread
From: Ingo Molnar @ 2025-04-21 18:51 UTC (permalink / raw)
To: linux-kernel
Cc: Andy Shevchenko, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
David Woodhouse
So it is a bit weird that the actual RAM entries of the E820 table
are not actually called RAM, but 'usable':
BIOS-e820: [mem 0x0000000000100000-0x000000007ffdbfff] 1.9 GB usable
'usable' is pretty passive-aggressive in that context and ambiguous,
most E820 entries denote 'usable' address ranges - reserved ranges
may be used by devices, or the platform.
Clarify and disambiguate this by making the boot log entry
explicitly say 'kernel usable RAM':
BIOS-e820: [mem 0x0000000000100000-0x000000007ffdbfff] 1.9 GB kernel usable RAM
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Shevchenko <andy@kernel.org>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
arch/x86/kernel/e820.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 8ee89962fcbf..99f997ae88dc 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -187,7 +187,7 @@ void __init e820__range_add(u64 start, u64 size, enum e820_type type)
static void __init e820_print_type(enum e820_type type)
{
switch (type) {
- case E820_TYPE_RAM: pr_cont(" usable"); break;
+ case E820_TYPE_RAM: pr_cont(" kernel usable RAM"); break;
case E820_TYPE_RESERVED: pr_cont(" reserved"); break;
case E820_TYPE_SOFT_RESERVED: pr_cont(" soft reserved"); break;
case E820_TYPE_ACPI: pr_cont(" ACPI data"); break;
--
2.45.2
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 09/29] x86/boot/e820: Call the PCI gap a 'gap' in the boot log printout
2025-04-21 18:51 [PATCH 00/29] x86/boot/e820: Assorted E820 table handling features and cleanups Ingo Molnar
` (7 preceding siblings ...)
2025-04-21 18:51 ` [PATCH 08/29] x86/boot/e820: Print E820_TYPE_RAM entries as ... RAM entries Ingo Molnar
@ 2025-04-21 18:51 ` Ingo Molnar
2025-04-21 18:51 ` [PATCH 10/29] x86/boot/e820: Use 'u64' consistently instead of 'unsigned long long' Ingo Molnar
` (20 subsequent siblings)
29 siblings, 0 replies; 68+ messages in thread
From: Ingo Molnar @ 2025-04-21 18:51 UTC (permalink / raw)
To: linux-kernel
Cc: Andy Shevchenko, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
David Woodhouse
It is a bit weird and inconsistent that the PCI gap is
advertised during bootup as 'mem'ory:
[mem 0xc0000000-0xfed1bfff] available for PCI devices
^^^
It's not really memory, it's a gap that PCI devices can decode
and use and they often do not map it to any memory themselves.
So advertise it for what it is, a gap:
[gap 0xc0000000-0xfed1bfff] available for PCI devices
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Shevchenko <andy@kernel.org>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
arch/x86/kernel/e820.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 99f997ae88dc..3eab0908ca71 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -741,7 +741,7 @@ __init void e820__setup_pci_gap(void)
*/
pci_mem_start = gapstart;
- pr_info("[mem %#010lx-%#010lx] available for PCI devices\n",
+ pr_info("[gap %#010lx-%#010lx] available for PCI devices\n",
gapstart, gapstart + gapsize - 1);
}
--
2.45.2
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 10/29] x86/boot/e820: Use 'u64' consistently instead of 'unsigned long long'
2025-04-21 18:51 [PATCH 00/29] x86/boot/e820: Assorted E820 table handling features and cleanups Ingo Molnar
` (8 preceding siblings ...)
2025-04-21 18:51 ` [PATCH 09/29] x86/boot/e820: Call the PCI gap a 'gap' in the boot log printout Ingo Molnar
@ 2025-04-21 18:51 ` Ingo Molnar
2025-04-22 6:44 ` Andy Shevchenko
2025-04-21 18:51 ` [PATCH 11/29] x86/boot/e820: Remove pointless early_panic() indirection Ingo Molnar
` (19 subsequent siblings)
29 siblings, 1 reply; 68+ messages in thread
From: Ingo Molnar @ 2025-04-21 18:51 UTC (permalink / raw)
To: linux-kernel
Cc: Andy Shevchenko, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
David Woodhouse
There's a number of structure fields and local variables related
to E820 entry physical addresses that are defined as 'unsigned long long',
but then are compared to u64 fields.
Make the types all consistently u64.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Shevchenko <andy@kernel.org>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
arch/x86/kernel/e820.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 3eab0908ca71..7d100e653554 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -338,7 +338,7 @@ struct change_member {
/* Pointer to the original entry: */
struct e820_entry *entry;
/* Address for this change point: */
- unsigned long long addr;
+ u64 addr;
};
static struct change_member change_point_list[2*E820_MAX_ENTRIES] __initdata;
@@ -386,7 +386,7 @@ int __init e820__update_table(struct e820_table *table)
struct e820_entry *entries = table->entries;
u32 max_nr_entries = ARRAY_SIZE(table->entries);
enum e820_type current_type, last_type;
- unsigned long long last_addr;
+ u64 last_addr;
u32 new_nr_entries, overlap_entries;
u32 i, chg_idx, chg_nr;
@@ -683,13 +683,13 @@ static void __init e820__update_table_kexec(void)
*/
static int __init e820_search_gap(unsigned long *gapstart, unsigned long *gapsize)
{
- unsigned long long last = MAX_GAP_END;
+ u64 last = MAX_GAP_END;
int i = e820_table->nr_entries;
int found = 0;
while (--i >= 0) {
- unsigned long long start = e820_table->entries[i].addr;
- unsigned long long end = start + e820_table->entries[i].size;
+ u64 start = e820_table->entries[i].addr;
+ u64 end = start + e820_table->entries[i].size;
/*
* Since "last" is at most 4GB, we know we'll
--
2.45.2
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 11/29] x86/boot/e820: Remove pointless early_panic() indirection
2025-04-21 18:51 [PATCH 00/29] x86/boot/e820: Assorted E820 table handling features and cleanups Ingo Molnar
` (9 preceding siblings ...)
2025-04-21 18:51 ` [PATCH 10/29] x86/boot/e820: Use 'u64' consistently instead of 'unsigned long long' Ingo Molnar
@ 2025-04-21 18:51 ` Ingo Molnar
2025-04-21 18:51 ` [PATCH 12/29] x86/boot/e820: Clean up confusing and self-contradictory verbiage around E820 related resource allocations Ingo Molnar
` (18 subsequent siblings)
29 siblings, 0 replies; 68+ messages in thread
From: Ingo Molnar @ 2025-04-21 18:51 UTC (permalink / raw)
To: linux-kernel
Cc: Andy Shevchenko, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
David Woodhouse
early_panic() is a pointless wrapper around panic():
static void __init early_panic(char *msg)
{
early_printk(msg);
panic(msg);
}
panic() will already do a printk() of 'msg', and an early_printk() if
earlyprintk is enabled. There's no need to print it separately.
Remove the function.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Shevchenko <andy@kernel.org>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
arch/x86/kernel/e820.c | 8 +-------
1 file changed, 1 insertion(+), 7 deletions(-)
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 7d100e653554..8fcf4edd832e 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -935,12 +935,6 @@ unsigned long __init e820__end_of_low_ram_pfn(void)
return e820__end_ram_pfn(1UL << (32 - PAGE_SHIFT));
}
-static void __init early_panic(char *msg)
-{
- early_printk(msg);
- panic(msg);
-}
-
static int userdef __initdata;
/* The "mem=nopentium" boot option disables 4MB page tables on 32-bit kernels: */
@@ -1060,7 +1054,7 @@ void __init e820__finish_early_params(void)
{
if (userdef) {
if (e820__update_table(e820_table) < 0)
- early_panic("Invalid user supplied memory map");
+ panic("Invalid user supplied memory map");
pr_info("user-defined physical RAM map:\n");
e820__print_table("user");
--
2.45.2
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 12/29] x86/boot/e820: Clean up confusing and self-contradictory verbiage around E820 related resource allocations
2025-04-21 18:51 [PATCH 00/29] x86/boot/e820: Assorted E820 table handling features and cleanups Ingo Molnar
` (10 preceding siblings ...)
2025-04-21 18:51 ` [PATCH 11/29] x86/boot/e820: Remove pointless early_panic() indirection Ingo Molnar
@ 2025-04-21 18:51 ` Ingo Molnar
2025-04-21 18:51 ` [PATCH 13/29] x86/boot/e820: Improve e820_print_type() messages Ingo Molnar
` (17 subsequent siblings)
29 siblings, 0 replies; 68+ messages in thread
From: Ingo Molnar @ 2025-04-21 18:51 UTC (permalink / raw)
To: linux-kernel
Cc: Andy Shevchenko, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
David Woodhouse
So the E820 code has a rather confusing area of code at around
e820__reserve_resources(), which is, by its plain reading,
rather self-contradictory. For example, the comment explaining
e820__reserve_resources() claims:
- '* Mark E820 reserved areas as busy for the resource manager'
By 'E820 reserved areas' one can naively conclude that it's
talking about E820_TYPE_RESERVED areas - while those areas
are treated in exactly the opposite fashion by do_mark_busy():
switch (type) {
case E820_TYPE_RESERVED:
case E820_TYPE_SOFT_RESERVED:
case E820_TYPE_PRAM:
case E820_TYPE_PMEM:
return false;
Ie. E820_TYPE_RESERVED areas are *not* marked busy for the
resource manager, because E820_TYPE_RESERVED areas are
device regions that might eventually be claimed by a device driver.
This type of confusion permeates this whole area of code,
making it exceedingly difficult to read (for me at least).
So untangle it bit by bit:
- Instead of talking about ambiguous 'reserved areas',
talk about 'E820 device address regions' instead,
and 'register'/'lock' them.
- The do_mark_busy() function is a misnomer as well, because despite
its name it 'does' nothing - it only determines what type
of resource handling an E820 type should receive from the
kernel. Rename it to e820_device_region() and negate its
meaning, to avoid the 'busy/reserved' confusion. Because
that's what this code is really about: filtering out
device regions such as E820_TYPE_RESERVED, E820_TYPE_PRAM,
E820_TYPE_PMEM, etc., and allowing them to be claimed
by device drivers later on.
- All other E820 regions (system regions) are registered and
locked early on, before the PCI resource manager does its
search for device BAR addresses, etc.
Also fix this somewhat misleading comment:
/*
* Try to bump up RAM regions to reasonable boundaries, to
* avoid stolen RAM:
*/
and explain that here we register artificial 'gap' resources
at the end of suspiciously sized RAM regions, as heuristics
to try to avoid buggy firmware with undeclared 'stolen RAM' regions:
/*
* Create additional 'gaps' at the end of RAM regions,
* rounding them up to 64k/1MB/64MB boundaries, should
* they be weirdly sized, and register extra, locked
* resource regions for them, to make sure drivers
* won't claim those addresses.
*
* These are basically blind guesses and heuristics to
* avoid resource conflicts with broken firmware that
* doesn't properly list 'stolen RAM' as a system region
* in the E820 map.
*/
Also improve the printout of this extra resource a bit: make the
message more unambiguous, and upgrade it from pr_debug() (where
very few people will see it), to pr_info() (where it will make
it into the syslog on default distro configs).
Also fix spelling and improve comment placement.
No change in functionality intended.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Shevchenko <andy@kernel.org>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
arch/x86/kernel/e820.c | 55 +++++++++++++++++++++++++++++++++-----------------
1 file changed, 37 insertions(+), 18 deletions(-)
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 8fcf4edd832e..484419711ecf 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -1106,37 +1106,44 @@ static unsigned long __init e820_type_to_iores_desc(struct e820_entry *entry)
}
}
-static bool __init do_mark_busy(enum e820_type type, struct resource *res)
+/*
+ * We assign one resource entry for each E820 map entry:
+ */
+static struct resource __initdata *e820_res;
+
+/*
+ * Is this a device address region that should not be marked busy?
+ * (Versus system address regions that we register & lock early.)
+ */
+static bool __init e820_device_region(enum e820_type type, struct resource *res)
{
- /* this is the legacy bios/dos rom-shadow + mmio region */
+ /* This is the legacy BIOS/DOS ROM-shadow + MMIO region: */
if (res->start < (1ULL<<20))
- return true;
+ return false;
/*
* Treat persistent memory and other special memory ranges like
- * device memory, i.e. reserve it for exclusive use of a driver
+ * device memory, i.e. keep it available for exclusive use of a
+ * driver:
*/
switch (type) {
case E820_TYPE_RESERVED:
case E820_TYPE_SOFT_RESERVED:
case E820_TYPE_PRAM:
case E820_TYPE_PMEM:
- return false;
+ return true;
case E820_TYPE_RAM:
case E820_TYPE_ACPI:
case E820_TYPE_NVS:
case E820_TYPE_UNUSABLE:
default:
- return true;
+ return false;
}
}
/*
- * Mark E820 reserved areas as busy for the resource manager:
+ * Mark E820 system regions as busy for the resource manager:
*/
-
-static struct resource __initdata *e820_res;
-
void __init e820__reserve_resources(void)
{
int i;
@@ -1162,18 +1169,18 @@ void __init e820__reserve_resources(void)
res->desc = e820_type_to_iores_desc(entry);
/*
- * Don't register the region that could be conflicted with
- * PCI device BAR resources and insert them later in
- * pcibios_resource_survey():
+ * Skip and don't register device regions that could be conflicted
+ * with PCI device BAR resources. They get inserted later in
+ * pcibios_resource_survey() -> e820__reserve_resources_late():
*/
- if (do_mark_busy(entry->type, res)) {
+ if (!e820_device_region(entry->type, res)) {
res->flags |= IORESOURCE_BUSY;
insert_resource(&iomem_resource, res);
}
res++;
}
- /* Expose the kexec e820 table to the sysfs. */
+ /* Expose the kexec e820 table to sysfs: */
for (i = 0; i < e820_table_kexec->nr_entries; i++) {
struct e820_entry *entry = e820_table_kexec->entries + i;
@@ -1207,6 +1214,10 @@ void __init e820__reserve_resources_late(void)
int i;
struct resource *res;
+ /*
+ * Register device address regions listed in the E820 map,
+ * these can be claimed by device drivers later on:
+ */
res = e820_res;
for (i = 0; i < e820_table->nr_entries; i++) {
if (!res->parent && res->end)
@@ -1215,8 +1226,16 @@ void __init e820__reserve_resources_late(void)
}
/*
- * Try to bump up RAM regions to reasonable boundaries, to
- * avoid stolen RAM:
+ * Create additional 'gaps' at the end of RAM regions,
+ * rounding them up to 64k/1MB/64MB boundaries, should
+ * they be weirdly sized, and register extra, locked
+ * resource regions for them, to make sure drivers
+ * won't claim those addresses.
+ *
+ * These are basically blind guesses and heuristics to
+ * avoid resource conflicts with broken firmware that
+ * doesn't properly list 'stolen RAM' as a system region
+ * in the E820 map.
*/
for (i = 0; i < e820_table->nr_entries; i++) {
struct e820_entry *entry = &e820_table->entries[i];
@@ -1232,7 +1251,7 @@ void __init e820__reserve_resources_late(void)
if (start >= end)
continue;
- printk(KERN_DEBUG "e820: reserve RAM buffer [mem %#010llx-%#010llx]\n", start, end);
+ pr_info("e820: register RAM buffer resource [mem %#010llx-%#010llx]\n", start, end);
reserve_region_with_split(&iomem_resource, start, end, "RAM buffer");
}
}
--
2.45.2
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 13/29] x86/boot/e820: Improve e820_print_type() messages
2025-04-21 18:51 [PATCH 00/29] x86/boot/e820: Assorted E820 table handling features and cleanups Ingo Molnar
` (11 preceding siblings ...)
2025-04-21 18:51 ` [PATCH 12/29] x86/boot/e820: Clean up confusing and self-contradictory verbiage around E820 related resource allocations Ingo Molnar
@ 2025-04-21 18:51 ` Ingo Molnar
2025-04-21 18:51 ` [PATCH 14/29] x86/boot/e820: Clean up __e820__range_add() a bit Ingo Molnar
` (16 subsequent siblings)
29 siblings, 0 replies; 68+ messages in thread
From: Ingo Molnar @ 2025-04-21 18:51 UTC (permalink / raw)
To: linux-kernel
Cc: Andy Shevchenko, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
David Woodhouse
For E820_TYPE_RESERVED, print:
'reserved' -> 'device reserved'
For E820_TYPE_PRAM and E820_TYPE_PMEM:
'persistent' -> 'persistent RAM'
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Shevchenko <andy@kernel.org>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
arch/x86/kernel/e820.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 484419711ecf..ae05161896d4 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -187,15 +187,15 @@ void __init e820__range_add(u64 start, u64 size, enum e820_type type)
static void __init e820_print_type(enum e820_type type)
{
switch (type) {
- case E820_TYPE_RAM: pr_cont(" kernel usable RAM"); break;
- case E820_TYPE_RESERVED: pr_cont(" reserved"); break;
- case E820_TYPE_SOFT_RESERVED: pr_cont(" soft reserved"); break;
- case E820_TYPE_ACPI: pr_cont(" ACPI data"); break;
- case E820_TYPE_NVS: pr_cont(" ACPI NVS"); break;
- case E820_TYPE_UNUSABLE: pr_cont(" unusable"); break;
+ case E820_TYPE_RAM: pr_cont(" kernel usable RAM"); break;
+ case E820_TYPE_RESERVED: pr_cont(" device reserved"); break;
+ case E820_TYPE_SOFT_RESERVED: pr_cont(" soft reserved"); break;
+ case E820_TYPE_ACPI: pr_cont(" ACPI data"); break;
+ case E820_TYPE_NVS: pr_cont(" ACPI NVS"); break;
+ case E820_TYPE_UNUSABLE: pr_cont(" unusable"); break;
case E820_TYPE_PMEM: /* Fall through: */
- case E820_TYPE_PRAM: pr_cont(" persistent (type %u)", type); break;
- default: pr_cont(" type %u", type); break;
+ case E820_TYPE_PRAM: pr_cont(" persistent RAM (type %u)", type); break;
+ default: pr_cont(" type %u", type); break;
}
}
--
2.45.2
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 14/29] x86/boot/e820: Clean up __e820__range_add() a bit
2025-04-21 18:51 [PATCH 00/29] x86/boot/e820: Assorted E820 table handling features and cleanups Ingo Molnar
` (12 preceding siblings ...)
2025-04-21 18:51 ` [PATCH 13/29] x86/boot/e820: Improve e820_print_type() messages Ingo Molnar
@ 2025-04-21 18:51 ` Ingo Molnar
2025-04-22 10:10 ` Andy Shevchenko
2025-04-21 18:51 ` [PATCH 15/29] x86/boot/e820: Clean up __refdata use " Ingo Molnar
` (15 subsequent siblings)
29 siblings, 1 reply; 68+ messages in thread
From: Ingo Molnar @ 2025-04-21 18:51 UTC (permalink / raw)
To: linux-kernel
Cc: Andy Shevchenko, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
David Woodhouse
- Use 'idx' index variable instead of a weird 'x'
- Make the error message E820-specific
- Group the code a bit better
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Shevchenko <andy@kernel.org>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
arch/x86/kernel/e820.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index ae05161896d4..6e626c4a3817 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -165,17 +165,18 @@ int e820__get_entry_type(u64 start, u64 end)
*/
static void __init __e820__range_add(struct e820_table *table, u64 start, u64 size, enum e820_type type)
{
- int x = table->nr_entries;
+ int idx = table->nr_entries;
- if (x >= ARRAY_SIZE(table->entries)) {
- pr_err("too many entries; ignoring [mem %#010llx-%#010llx]\n",
- start, start + size - 1);
+ if (idx >= ARRAY_SIZE(table->entries)) {
+ pr_err("too many E820 table entries; ignoring [mem %#010llx-%#010llx]\n",
+ start, start + size-1);
return;
}
- table->entries[x].addr = start;
- table->entries[x].size = size;
- table->entries[x].type = type;
+ table->entries[idx].addr = start;
+ table->entries[idx].size = size;
+ table->entries[idx].type = type;
+
table->nr_entries++;
}
--
2.45.2
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 15/29] x86/boot/e820: Clean up __refdata use a bit
2025-04-21 18:51 [PATCH 00/29] x86/boot/e820: Assorted E820 table handling features and cleanups Ingo Molnar
` (13 preceding siblings ...)
2025-04-21 18:51 ` [PATCH 14/29] x86/boot/e820: Clean up __e820__range_add() a bit Ingo Molnar
@ 2025-04-21 18:51 ` Ingo Molnar
2025-04-21 18:51 ` [PATCH 16/29] x86/boot/e820: Remove unnecessary header inclusions Ingo Molnar
` (14 subsequent siblings)
29 siblings, 0 replies; 68+ messages in thread
From: Ingo Molnar @ 2025-04-21 18:51 UTC (permalink / raw)
To: linux-kernel
Cc: Andy Shevchenko, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
David Woodhouse
So __refdata, like __init, is more of a storage class specifier,
so move the attribute in front of the type, not after the variable
name. This also aligns it vertically.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Shevchenko <andy@kernel.org>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
arch/x86/kernel/e820.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 6e626c4a3817..cb4e5349fd22 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -60,9 +60,9 @@ static struct e820_table e820_table_init __initdata;
static struct e820_table e820_table_kexec_init __initdata;
static struct e820_table e820_table_firmware_init __initdata;
-struct e820_table *e820_table __refdata = &e820_table_init;
-struct e820_table *e820_table_kexec __refdata = &e820_table_kexec_init;
-struct e820_table *e820_table_firmware __refdata = &e820_table_firmware_init;
+__refdata struct e820_table *e820_table = &e820_table_init;
+__refdata struct e820_table *e820_table_kexec = &e820_table_kexec_init;
+__refdata struct e820_table *e820_table_firmware = &e820_table_firmware_init;
/* For PCI or other memory-mapped resources */
unsigned long pci_mem_start = 0xaeedbabe;
--
2.45.2
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 16/29] x86/boot/e820: Remove unnecessary header inclusions
2025-04-21 18:51 [PATCH 00/29] x86/boot/e820: Assorted E820 table handling features and cleanups Ingo Molnar
` (14 preceding siblings ...)
2025-04-21 18:51 ` [PATCH 15/29] x86/boot/e820: Clean up __refdata use " Ingo Molnar
@ 2025-04-21 18:51 ` Ingo Molnar
2025-04-21 18:51 ` [PATCH 17/29] x86/boot/e820: Standardize e820 table index variable names under 'idx' Ingo Molnar
` (13 subsequent siblings)
29 siblings, 0 replies; 68+ messages in thread
From: Ingo Molnar @ 2025-04-21 18:51 UTC (permalink / raw)
To: linux-kernel
Cc: Andy Shevchenko, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
David Woodhouse
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Shevchenko <andy@kernel.org>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
arch/x86/kernel/e820.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index cb4e5349fd22..158f9a46ba55 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -9,13 +9,11 @@
* quirks and other tweaks, and feeds that into the generic Linux memory
* allocation code routines via a platform independent interface (memblock, etc.).
*/
-#include <linux/crash_dump.h>
#include <linux/memblock.h>
#include <linux/suspend.h>
#include <linux/acpi.h>
#include <linux/firmware-map.h>
#include <linux/sort.h>
-#include <linux/memory_hotplug.h>
#include <asm/e820/api.h>
#include <asm/setup.h>
--
2.45.2
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 17/29] x86/boot/e820: Standardize e820 table index variable names under 'idx'
2025-04-21 18:51 [PATCH 00/29] x86/boot/e820: Assorted E820 table handling features and cleanups Ingo Molnar
` (15 preceding siblings ...)
2025-04-21 18:51 ` [PATCH 16/29] x86/boot/e820: Remove unnecessary header inclusions Ingo Molnar
@ 2025-04-21 18:51 ` Ingo Molnar
2025-04-22 10:23 ` Andy Shevchenko
2025-04-21 18:51 ` [PATCH 18/29] x86/boot/e820: Change struct e820_table::nr_entries type from __u32 to u32 Ingo Molnar
` (12 subsequent siblings)
29 siblings, 1 reply; 68+ messages in thread
From: Ingo Molnar @ 2025-04-21 18:51 UTC (permalink / raw)
To: linux-kernel
Cc: Andy Shevchenko, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
David Woodhouse
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Shevchenko <andy@kernel.org>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
arch/x86/kernel/e820.c | 114 ++++++++++++++++++++++++-------------------------
1 file changed, 57 insertions(+), 57 deletions(-)
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 158f9a46ba55..919950d0f03d 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -75,10 +75,10 @@ EXPORT_SYMBOL(pci_mem_start);
static bool _e820__mapped_any(struct e820_table *table,
u64 start, u64 end, enum e820_type type)
{
- int i;
+ int idx;
- for (i = 0; i < table->nr_entries; i++) {
- struct e820_entry *entry = &table->entries[i];
+ for (idx = 0; idx < table->nr_entries; idx++) {
+ struct e820_entry *entry = &table->entries[idx];
if (type && entry->type != type)
continue;
@@ -110,10 +110,10 @@ EXPORT_SYMBOL_GPL(e820__mapped_any);
static struct e820_entry *__e820__mapped_all(u64 start, u64 end,
enum e820_type type)
{
- int i;
+ int idx;
- for (i = 0; i < e820_table->nr_entries; i++) {
- struct e820_entry *entry = &e820_table->entries[i];
+ for (idx = 0; idx < e820_table->nr_entries; idx++) {
+ struct e820_entry *entry = &e820_table->entries[idx];
if (type && entry->type != type)
continue;
@@ -236,10 +236,10 @@ static void e820_print_size(u64 size)
static void __init e820__print_table(const char *who)
{
u64 range_end_prev = 0;
- int i;
+ int idx;
- for (i = 0; i < e820_table->nr_entries; i++) {
- struct e820_entry *entry = e820_table->entries + i;
+ for (idx = 0; idx < e820_table->nr_entries; idx++) {
+ struct e820_entry *entry = e820_table->entries + idx;
u64 range_start, range_end;
range_start = entry->addr;
@@ -387,7 +387,7 @@ int __init e820__update_table(struct e820_table *table)
enum e820_type current_type, last_type;
u64 last_addr;
u32 new_nr_entries, overlap_entries;
- u32 i, chg_idx, chg_nr;
+ u32 idx, chg_idx, chg_nr;
/* If there's only one memory region, don't bother: */
if (table->nr_entries < 2)
@@ -396,26 +396,26 @@ int __init e820__update_table(struct e820_table *table)
BUG_ON(table->nr_entries > max_nr_entries);
/* Bail out if we find any unreasonable addresses in the map: */
- for (i = 0; i < table->nr_entries; i++) {
- if (entries[i].addr + entries[i].size < entries[i].addr)
+ for (idx = 0; idx < table->nr_entries; idx++) {
+ if (entries[idx].addr + entries[idx].size < entries[idx].addr)
return -1;
}
/* Create pointers for initial change-point information (for sorting): */
- for (i = 0; i < 2 * table->nr_entries; i++)
- change_point[i] = &change_point_list[i];
+ for (idx = 0; idx < 2 * table->nr_entries; idx++)
+ change_point[idx] = &change_point_list[idx];
/*
* Record all known change-points (starting and ending addresses),
* omitting empty memory regions:
*/
chg_idx = 0;
- for (i = 0; i < table->nr_entries; i++) {
- if (entries[i].size != 0) {
- change_point[chg_idx]->addr = entries[i].addr;
- change_point[chg_idx++]->entry = &entries[i];
- change_point[chg_idx]->addr = entries[i].addr + entries[i].size;
- change_point[chg_idx++]->entry = &entries[i];
+ for (idx = 0; idx < table->nr_entries; idx++) {
+ if (entries[idx].size != 0) {
+ change_point[chg_idx]->addr = entries[idx].addr;
+ change_point[chg_idx++]->entry = &entries[idx];
+ change_point[chg_idx]->addr = entries[idx].addr + entries[idx].size;
+ change_point[chg_idx++]->entry = &entries[idx];
}
}
chg_nr = chg_idx;
@@ -437,9 +437,9 @@ int __init e820__update_table(struct e820_table *table)
overlap_list[overlap_entries++] = change_point[chg_idx]->entry;
} else {
/* Remove entry from list (order independent, so swap with last): */
- for (i = 0; i < overlap_entries; i++) {
- if (overlap_list[i] == change_point[chg_idx]->entry)
- overlap_list[i] = overlap_list[overlap_entries-1];
+ for (idx = 0; idx < overlap_entries; idx++) {
+ if (overlap_list[idx] == change_point[chg_idx]->entry)
+ overlap_list[idx] = overlap_list[overlap_entries-1];
}
overlap_entries--;
}
@@ -449,9 +449,9 @@ int __init e820__update_table(struct e820_table *table)
* 1=usable, 2,3,4,4+=unusable)
*/
current_type = 0;
- for (i = 0; i < overlap_entries; i++) {
- if (overlap_list[i]->type > current_type)
- current_type = overlap_list[i]->type;
+ for (idx = 0; idx < overlap_entries; idx++) {
+ if (overlap_list[idx]->type > current_type)
+ current_type = overlap_list[idx]->type;
}
/* Continue building up new map based on this information: */
@@ -524,7 +524,7 @@ static u64 __init
__e820__range_update(struct e820_table *table, u64 start, u64 size, enum e820_type old_type, enum e820_type new_type)
{
u64 end;
- unsigned int i;
+ unsigned int idx;
u64 real_updated_size = 0;
BUG_ON(old_type == new_type);
@@ -539,8 +539,8 @@ __e820__range_update(struct e820_table *table, u64 start, u64 size, enum e820_ty
e820_print_type(new_type);
pr_cont("\n");
- for (i = 0; i < table->nr_entries; i++) {
- struct e820_entry *entry = &table->entries[i];
+ for (idx = 0; idx < table->nr_entries; idx++) {
+ struct e820_entry *entry = &table->entries[idx];
u64 final_start, final_end;
u64 entry_end;
@@ -602,7 +602,7 @@ u64 __init e820__range_update_table(struct e820_table *t, u64 start, u64 size,
/* Remove a range of memory from the E820 table: */
u64 __init e820__range_remove(u64 start, u64 size, enum e820_type old_type, bool check_type)
{
- int i;
+ int idx;
u64 end;
u64 real_removed_size = 0;
@@ -615,8 +615,8 @@ u64 __init e820__range_remove(u64 start, u64 size, enum e820_type old_type, bool
e820_print_type(old_type);
pr_cont("\n");
- for (i = 0; i < e820_table->nr_entries; i++) {
- struct e820_entry *entry = &e820_table->entries[i];
+ for (idx = 0; idx < e820_table->nr_entries; idx++) {
+ struct e820_entry *entry = &e820_table->entries[idx];
u64 final_start, final_end;
u64 entry_end;
@@ -683,12 +683,12 @@ static void __init e820__update_table_kexec(void)
static int __init e820_search_gap(unsigned long *gapstart, unsigned long *gapsize)
{
u64 last = MAX_GAP_END;
- int i = e820_table->nr_entries;
+ int idx = e820_table->nr_entries;
int found = 0;
- while (--i >= 0) {
- u64 start = e820_table->entries[i].addr;
- u64 end = start + e820_table->entries[i].size;
+ while (--idx >= 0) {
+ u64 start = e820_table->entries[idx].addr;
+ u64 end = start + e820_table->entries[idx].size;
/*
* Since "last" is at most 4GB, we know we'll
@@ -814,11 +814,11 @@ void __init e820__memory_setup_extended(u64 phys_addr, u32 data_len)
*/
void __init e820__register_nosave_regions(unsigned long limit_pfn)
{
- int i;
+ int idx;
u64 last_addr = 0;
- for (i = 0; i < e820_table->nr_entries; i++) {
- struct e820_entry *entry = &e820_table->entries[i];
+ for (idx = 0; idx < e820_table->nr_entries; idx++) {
+ struct e820_entry *entry = &e820_table->entries[idx];
if (entry->type != E820_TYPE_RAM)
continue;
@@ -839,10 +839,10 @@ void __init e820__register_nosave_regions(unsigned long limit_pfn)
*/
static int __init e820__register_nvs_regions(void)
{
- int i;
+ int idx;
- for (i = 0; i < e820_table->nr_entries; i++) {
- struct e820_entry *entry = &e820_table->entries[i];
+ for (idx = 0; idx < e820_table->nr_entries; idx++) {
+ struct e820_entry *entry = &e820_table->entries[idx];
if (entry->type == E820_TYPE_NVS)
acpi_nvs_register(entry->addr, entry->size);
@@ -890,12 +890,12 @@ u64 __init e820__memblock_alloc_reserved(u64 size, u64 align)
*/
static unsigned long __init e820__end_ram_pfn(unsigned long limit_pfn)
{
- int i;
+ int idx;
unsigned long last_pfn = 0;
unsigned long max_arch_pfn = MAX_ARCH_PFN;
- for (i = 0; i < e820_table->nr_entries; i++) {
- struct e820_entry *entry = &e820_table->entries[i];
+ for (idx = 0; idx < e820_table->nr_entries; idx++) {
+ struct e820_entry *entry = &e820_table->entries[idx];
unsigned long start_pfn;
unsigned long end_pfn;
@@ -1145,7 +1145,7 @@ static bool __init e820_device_region(enum e820_type type, struct resource *res)
*/
void __init e820__reserve_resources(void)
{
- int i;
+ int idx;
struct resource *res;
u64 end;
@@ -1153,8 +1153,8 @@ void __init e820__reserve_resources(void)
SMP_CACHE_BYTES);
e820_res = res;
- for (i = 0; i < e820_table->nr_entries; i++) {
- struct e820_entry *entry = e820_table->entries + i;
+ for (idx = 0; idx < e820_table->nr_entries; idx++) {
+ struct e820_entry *entry = e820_table->entries + idx;
end = entry->addr + entry->size - 1;
if (end != (resource_size_t)end) {
@@ -1180,8 +1180,8 @@ void __init e820__reserve_resources(void)
}
/* Expose the kexec e820 table to sysfs: */
- for (i = 0; i < e820_table_kexec->nr_entries; i++) {
- struct e820_entry *entry = e820_table_kexec->entries + i;
+ for (idx = 0; idx < e820_table_kexec->nr_entries; idx++) {
+ struct e820_entry *entry = e820_table_kexec->entries + idx;
firmware_map_add_early(entry->addr, entry->addr + entry->size, e820_type_to_string(entry));
}
@@ -1210,7 +1210,7 @@ static unsigned long __init ram_alignment(resource_size_t pos)
void __init e820__reserve_resources_late(void)
{
- int i;
+ int idx;
struct resource *res;
/*
@@ -1218,7 +1218,7 @@ void __init e820__reserve_resources_late(void)
* these can be claimed by device drivers later on:
*/
res = e820_res;
- for (i = 0; i < e820_table->nr_entries; i++) {
+ for (idx = 0; idx < e820_table->nr_entries; idx++) {
if (!res->parent && res->end)
insert_resource_expand_to_fit(&iomem_resource, res);
res++;
@@ -1236,8 +1236,8 @@ void __init e820__reserve_resources_late(void)
* doesn't properly list 'stolen RAM' as a system region
* in the E820 map.
*/
- for (i = 0; i < e820_table->nr_entries; i++) {
- struct e820_entry *entry = &e820_table->entries[i];
+ for (idx = 0; idx < e820_table->nr_entries; idx++) {
+ struct e820_entry *entry = &e820_table->entries[idx];
u64 start, end;
if (entry->type != E820_TYPE_RAM)
@@ -1314,7 +1314,7 @@ void __init e820__memory_setup(void)
void __init e820__memblock_setup(void)
{
- int i;
+ int idx;
u64 end;
#ifdef CONFIG_MEMORY_HOTPLUG
@@ -1358,8 +1358,8 @@ void __init e820__memblock_setup(void)
*/
memblock_allow_resize();
- for (i = 0; i < e820_table->nr_entries; i++) {
- struct e820_entry *entry = &e820_table->entries[i];
+ for (idx = 0; idx < e820_table->nr_entries; idx++) {
+ struct e820_entry *entry = &e820_table->entries[idx];
end = entry->addr + entry->size;
if (end != (resource_size_t)end)
--
2.45.2
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 18/29] x86/boot/e820: Change struct e820_table::nr_entries type from __u32 to u32
2025-04-21 18:51 [PATCH 00/29] x86/boot/e820: Assorted E820 table handling features and cleanups Ingo Molnar
` (16 preceding siblings ...)
2025-04-21 18:51 ` [PATCH 17/29] x86/boot/e820: Standardize e820 table index variable names under 'idx' Ingo Molnar
@ 2025-04-21 18:51 ` Ingo Molnar
2025-04-21 18:51 ` [PATCH 19/29] x86/boot/e820: Standardize e820 table index variable types under 'u32' Ingo Molnar
` (11 subsequent siblings)
29 siblings, 0 replies; 68+ messages in thread
From: Ingo Molnar @ 2025-04-21 18:51 UTC (permalink / raw)
To: linux-kernel
Cc: Andy Shevchenko, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
David Woodhouse
__u32 is for UAPI headers, and this definition is the only place
in the kernel-internal E820 code that uses __u32. Change it to u32.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Shevchenko <andy@kernel.org>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
arch/x86/include/asm/e820/types.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/e820/types.h b/arch/x86/include/asm/e820/types.h
index 80c4a7266629..df12f7ee75d3 100644
--- a/arch/x86/include/asm/e820/types.h
+++ b/arch/x86/include/asm/e820/types.h
@@ -83,7 +83,7 @@ struct e820_entry {
* The whole array of E820 entries:
*/
struct e820_table {
- __u32 nr_entries;
+ u32 nr_entries;
struct e820_entry entries[E820_MAX_ENTRIES];
};
--
2.45.2
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 19/29] x86/boot/e820: Standardize e820 table index variable types under 'u32'
2025-04-21 18:51 [PATCH 00/29] x86/boot/e820: Assorted E820 table handling features and cleanups Ingo Molnar
` (17 preceding siblings ...)
2025-04-21 18:51 ` [PATCH 18/29] x86/boot/e820: Change struct e820_table::nr_entries type from __u32 to u32 Ingo Molnar
@ 2025-04-21 18:51 ` Ingo Molnar
2025-04-22 15:13 ` Andy Shevchenko
2025-04-21 18:52 ` [PATCH 20/29] x86/boot/e820: Clean up e820__setup_pci_gap()/e820_search_gap() a bit Ingo Molnar
` (10 subsequent siblings)
29 siblings, 1 reply; 68+ messages in thread
From: Ingo Molnar @ 2025-04-21 18:51 UTC (permalink / raw)
To: linux-kernel
Cc: Andy Shevchenko, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
David Woodhouse
So we have 'idx' types of 'int' and 'unsigned int', and sometimes
we assign 'u32' fields such as e820_table::nr_entries to these 'int'
values.
While there's no real risk of overflow with these tables, make it
all cleaner by standardizing on a single type: u32.
This also happens to shrink the code a bit:
text data bss dec hex filename
7745 44072 0 51817 ca69 e820.o.before
7613 44072 0 51685 c9e5 e820.o.after
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Shevchenko <andy@kernel.org>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
arch/x86/kernel/e820.c | 24 ++++++++++++------------
1 file changed, 12 insertions(+), 12 deletions(-)
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 919950d0f03d..d55cb77537e7 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -75,7 +75,7 @@ EXPORT_SYMBOL(pci_mem_start);
static bool _e820__mapped_any(struct e820_table *table,
u64 start, u64 end, enum e820_type type)
{
- int idx;
+ u32 idx;
for (idx = 0; idx < table->nr_entries; idx++) {
struct e820_entry *entry = &table->entries[idx];
@@ -110,7 +110,7 @@ EXPORT_SYMBOL_GPL(e820__mapped_any);
static struct e820_entry *__e820__mapped_all(u64 start, u64 end,
enum e820_type type)
{
- int idx;
+ u32 idx;
for (idx = 0; idx < e820_table->nr_entries; idx++) {
struct e820_entry *entry = &e820_table->entries[idx];
@@ -163,7 +163,7 @@ int e820__get_entry_type(u64 start, u64 end)
*/
static void __init __e820__range_add(struct e820_table *table, u64 start, u64 size, enum e820_type type)
{
- int idx = table->nr_entries;
+ u32 idx = table->nr_entries;
if (idx >= ARRAY_SIZE(table->entries)) {
pr_err("too many E820 table entries; ignoring [mem %#010llx-%#010llx]\n",
@@ -236,7 +236,7 @@ static void e820_print_size(u64 size)
static void __init e820__print_table(const char *who)
{
u64 range_end_prev = 0;
- int idx;
+ u32 idx;
for (idx = 0; idx < e820_table->nr_entries; idx++) {
struct e820_entry *entry = e820_table->entries + idx;
@@ -524,7 +524,7 @@ static u64 __init
__e820__range_update(struct e820_table *table, u64 start, u64 size, enum e820_type old_type, enum e820_type new_type)
{
u64 end;
- unsigned int idx;
+ u32 idx;
u64 real_updated_size = 0;
BUG_ON(old_type == new_type);
@@ -602,7 +602,7 @@ u64 __init e820__range_update_table(struct e820_table *t, u64 start, u64 size,
/* Remove a range of memory from the E820 table: */
u64 __init e820__range_remove(u64 start, u64 size, enum e820_type old_type, bool check_type)
{
- int idx;
+ u32 idx;
u64 end;
u64 real_removed_size = 0;
@@ -814,7 +814,7 @@ void __init e820__memory_setup_extended(u64 phys_addr, u32 data_len)
*/
void __init e820__register_nosave_regions(unsigned long limit_pfn)
{
- int idx;
+ u32 idx;
u64 last_addr = 0;
for (idx = 0; idx < e820_table->nr_entries; idx++) {
@@ -839,7 +839,7 @@ void __init e820__register_nosave_regions(unsigned long limit_pfn)
*/
static int __init e820__register_nvs_regions(void)
{
- int idx;
+ u32 idx;
for (idx = 0; idx < e820_table->nr_entries; idx++) {
struct e820_entry *entry = &e820_table->entries[idx];
@@ -890,7 +890,7 @@ u64 __init e820__memblock_alloc_reserved(u64 size, u64 align)
*/
static unsigned long __init e820__end_ram_pfn(unsigned long limit_pfn)
{
- int idx;
+ u32 idx;
unsigned long last_pfn = 0;
unsigned long max_arch_pfn = MAX_ARCH_PFN;
@@ -1145,7 +1145,7 @@ static bool __init e820_device_region(enum e820_type type, struct resource *res)
*/
void __init e820__reserve_resources(void)
{
- int idx;
+ u32 idx;
struct resource *res;
u64 end;
@@ -1210,7 +1210,7 @@ static unsigned long __init ram_alignment(resource_size_t pos)
void __init e820__reserve_resources_late(void)
{
- int idx;
+ u32 idx;
struct resource *res;
/*
@@ -1314,7 +1314,7 @@ void __init e820__memory_setup(void)
void __init e820__memblock_setup(void)
{
- int idx;
+ u32 idx;
u64 end;
#ifdef CONFIG_MEMORY_HOTPLUG
--
2.45.2
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 20/29] x86/boot/e820: Clean up e820__setup_pci_gap()/e820_search_gap() a bit
2025-04-21 18:51 [PATCH 00/29] x86/boot/e820: Assorted E820 table handling features and cleanups Ingo Molnar
` (18 preceding siblings ...)
2025-04-21 18:51 ` [PATCH 19/29] x86/boot/e820: Standardize e820 table index variable types under 'u32' Ingo Molnar
@ 2025-04-21 18:52 ` Ingo Molnar
2025-04-22 16:37 ` Andy Shevchenko
2025-04-21 18:52 ` [PATCH 21/29] x86/boot/e820: Change e820_search_gap() to search for the highest-address PCI gap Ingo Molnar
` (9 subsequent siblings)
29 siblings, 1 reply; 68+ messages in thread
From: Ingo Molnar @ 2025-04-21 18:52 UTC (permalink / raw)
To: linux-kernel
Cc: Andy Shevchenko, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
David Woodhouse
Use a bit more readable variable names, we haven't run out of
underscore characters in the kernel yet.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Shevchenko <andy@kernel.org>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
arch/x86/kernel/e820.c | 22 +++++++++++-----------
1 file changed, 11 insertions(+), 11 deletions(-)
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index d55cb77537e7..01d50331856c 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -680,7 +680,7 @@ static void __init e820__update_table_kexec(void)
/*
* Search for a gap in the E820 memory space from 0 to MAX_GAP_END (4GB).
*/
-static int __init e820_search_gap(unsigned long *gapstart, unsigned long *gapsize)
+static int __init e820_search_gap(unsigned long *gap_start, unsigned long *gap_size)
{
u64 last = MAX_GAP_END;
int idx = e820_table->nr_entries;
@@ -697,9 +697,9 @@ static int __init e820_search_gap(unsigned long *gapstart, unsigned long *gapsiz
if (last > end) {
unsigned long gap = last - end;
- if (gap >= *gapsize) {
- *gapsize = gap;
- *gapstart = end;
+ if (gap >= *gap_size) {
+ *gap_size = gap;
+ *gap_start = end;
found = 1;
}
}
@@ -719,29 +719,29 @@ static int __init e820_search_gap(unsigned long *gapstart, unsigned long *gapsiz
*/
__init void e820__setup_pci_gap(void)
{
- unsigned long gapstart, gapsize;
+ unsigned long gap_start, gap_size;
int found;
- gapsize = 0x400000;
- found = e820_search_gap(&gapstart, &gapsize);
+ gap_size = 0x400000;
+ found = e820_search_gap(&gap_start, &gap_size);
if (!found) {
#ifdef CONFIG_X86_64
- gapstart = (max_pfn << PAGE_SHIFT) + 1024*1024;
+ gap_start = (max_pfn << PAGE_SHIFT) + 1024*1024;
pr_err("Cannot find an available gap in the 32-bit address range\n");
pr_err("PCI devices with unassigned 32-bit BARs may not work!\n");
#else
- gapstart = 0x10000000;
+ gap_start = 0x10000000;
#endif
}
/*
* e820__reserve_resources_late() protects stolen RAM already:
*/
- pci_mem_start = gapstart;
+ pci_mem_start = gap_start;
pr_info("[gap %#010lx-%#010lx] available for PCI devices\n",
- gapstart, gapstart + gapsize - 1);
+ gap_start, gap_start + gap_size - 1);
}
/*
--
2.45.2
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 21/29] x86/boot/e820: Change e820_search_gap() to search for the highest-address PCI gap
2025-04-21 18:51 [PATCH 00/29] x86/boot/e820: Assorted E820 table handling features and cleanups Ingo Molnar
` (19 preceding siblings ...)
2025-04-21 18:52 ` [PATCH 20/29] x86/boot/e820: Clean up e820__setup_pci_gap()/e820_search_gap() a bit Ingo Molnar
@ 2025-04-21 18:52 ` Ingo Molnar
2025-04-21 18:52 ` [PATCH 22/29] x86/boot/e820: Rename gap_start/gap_size to max_gap_start/max_gap_start in e820_search_gap() et al Ingo Molnar
` (8 subsequent siblings)
29 siblings, 0 replies; 68+ messages in thread
From: Ingo Molnar @ 2025-04-21 18:52 UTC (permalink / raw)
To: linux-kernel
Cc: Andy Shevchenko, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
David Woodhouse
Right now the main x86 function that determines the position and size
of the 'PCI gap', e820_search_gap(), has this curious property:
while (--idx >= 0) {
...
if (gap >= *gap_size) {
I.e. it will iterate the E820 table backwards, from its end to the beginning,
and will search for larger and larger gaps in the memory map below 4GB,
until it finishes with the table.
This logic will, should there be two gaps with the same size, pick the
one with the lower physical address - which is contrary to usual
practice that the PCI gap is just below 4GB.
Furthermore, the commit that introduced this weird logic 16 years ago:
3381959da5a0 ("x86: cleanup e820_setup_gap(), add e820_search_gap(), v2")
- if (gap > gapsize) {
+ if (gap >= *gapsize) {
didn't even declare this change, the title says it's a cleanup,
and the changelog declares it as a preparatory refactoring
for a later bugfix:
809d9a8f93bd ("x86/PCI: ACPI based PCI gap calculation")
which bugfix was reverted only 1 day later without much of an
explanation, and was never reintroduced:
58b6e5538460 ("Revert "x86/PCI: ACPI based PCI gap calculation"")
So based on the Git archeology and by the plain reading of the
code I declare this '>=' change an unintended bug and side effect.
Change it to '>' again.
It should not make much of a difference in practice, as the
likelihood of having *two* largest gaps with exactly the same
size are very low outside of weird user-provided memory maps.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Shevchenko <andy@kernel.org>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
arch/x86/kernel/e820.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 01d50331856c..24b2e8a93853 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -697,7 +697,7 @@ static int __init e820_search_gap(unsigned long *gap_start, unsigned long *gap_s
if (last > end) {
unsigned long gap = last - end;
- if (gap >= *gap_size) {
+ if (gap > *gap_size) {
*gap_size = gap;
*gap_start = end;
found = 1;
--
2.45.2
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 22/29] x86/boot/e820: Rename gap_start/gap_size to max_gap_start/max_gap_start in e820_search_gap() et al
2025-04-21 18:51 [PATCH 00/29] x86/boot/e820: Assorted E820 table handling features and cleanups Ingo Molnar
` (20 preceding siblings ...)
2025-04-21 18:52 ` [PATCH 21/29] x86/boot/e820: Change e820_search_gap() to search for the highest-address PCI gap Ingo Molnar
@ 2025-04-21 18:52 ` Ingo Molnar
2025-04-21 18:52 ` [PATCH 23/29] x86/boot/e820: Simplify & clarify __e820__range_add() a bit Ingo Molnar
` (7 subsequent siblings)
29 siblings, 0 replies; 68+ messages in thread
From: Ingo Molnar @ 2025-04-21 18:52 UTC (permalink / raw)
To: linux-kernel
Cc: Andy Shevchenko, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
David Woodhouse
The PCI gap searching functions pass around pointers to the
gap_start/gap_size variables, which refer to the maximum
size gap found so far.
Rename the variables to say so, and disambiguate their namespace
from 'current gap' variables.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Shevchenko <andy@kernel.org>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
arch/x86/kernel/e820.c | 22 +++++++++++-----------
1 file changed, 11 insertions(+), 11 deletions(-)
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 24b2e8a93853..d0f9c544a14c 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -680,7 +680,7 @@ static void __init e820__update_table_kexec(void)
/*
* Search for a gap in the E820 memory space from 0 to MAX_GAP_END (4GB).
*/
-static int __init e820_search_gap(unsigned long *gap_start, unsigned long *gap_size)
+static int __init e820_search_gap(unsigned long *max_gap_start, unsigned long *max_gap_size)
{
u64 last = MAX_GAP_END;
int idx = e820_table->nr_entries;
@@ -697,9 +697,9 @@ static int __init e820_search_gap(unsigned long *gap_start, unsigned long *gap_s
if (last > end) {
unsigned long gap = last - end;
- if (gap > *gap_size) {
- *gap_size = gap;
- *gap_start = end;
+ if (gap > *max_gap_size) {
+ *max_gap_size = gap;
+ *max_gap_start = end;
found = 1;
}
}
@@ -719,29 +719,29 @@ static int __init e820_search_gap(unsigned long *gap_start, unsigned long *gap_s
*/
__init void e820__setup_pci_gap(void)
{
- unsigned long gap_start, gap_size;
+ unsigned long max_gap_start, max_gap_size;
int found;
- gap_size = 0x400000;
- found = e820_search_gap(&gap_start, &gap_size);
+ max_gap_size = 0x400000;
+ found = e820_search_gap(&max_gap_start, &max_gap_size);
if (!found) {
#ifdef CONFIG_X86_64
- gap_start = (max_pfn << PAGE_SHIFT) + 1024*1024;
+ max_gap_start = (max_pfn << PAGE_SHIFT) + 1024*1024;
pr_err("Cannot find an available gap in the 32-bit address range\n");
pr_err("PCI devices with unassigned 32-bit BARs may not work!\n");
#else
- gap_start = 0x10000000;
+ max_gap_start = 0x10000000;
#endif
}
/*
* e820__reserve_resources_late() protects stolen RAM already:
*/
- pci_mem_start = gap_start;
+ pci_mem_start = max_gap_start;
pr_info("[gap %#010lx-%#010lx] available for PCI devices\n",
- gap_start, gap_start + gap_size - 1);
+ max_gap_start, max_gap_start + max_gap_size - 1);
}
/*
--
2.45.2
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 23/29] x86/boot/e820: Simplify & clarify __e820__range_add() a bit
2025-04-21 18:51 [PATCH 00/29] x86/boot/e820: Assorted E820 table handling features and cleanups Ingo Molnar
` (21 preceding siblings ...)
2025-04-21 18:52 ` [PATCH 22/29] x86/boot/e820: Rename gap_start/gap_size to max_gap_start/max_gap_start in e820_search_gap() et al Ingo Molnar
@ 2025-04-21 18:52 ` Ingo Molnar
2025-04-21 18:52 ` [PATCH 24/29] x86/boot/e820: Standardize __init/__initdata tag placement Ingo Molnar
` (6 subsequent siblings)
29 siblings, 0 replies; 68+ messages in thread
From: Ingo Molnar @ 2025-04-21 18:52 UTC (permalink / raw)
To: linux-kernel
Cc: Andy Shevchenko, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
David Woodhouse
Use 'entry_new' to make clear we are allocating a new entry.
Change the table-full message to say that the table is full.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Shevchenko <andy@kernel.org>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
arch/x86/kernel/e820.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index d0f9c544a14c..e5a50aadc631 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -164,16 +164,19 @@ int e820__get_entry_type(u64 start, u64 end)
static void __init __e820__range_add(struct e820_table *table, u64 start, u64 size, enum e820_type type)
{
u32 idx = table->nr_entries;
+ struct e820_entry *entry_new;
if (idx >= ARRAY_SIZE(table->entries)) {
- pr_err("too many E820 table entries; ignoring [mem %#010llx-%#010llx]\n",
+ pr_err("E820 table full; ignoring [mem %#010llx-%#010llx]\n",
start, start + size-1);
return;
}
- table->entries[idx].addr = start;
- table->entries[idx].size = size;
- table->entries[idx].type = type;
+ entry_new = table->entries + idx;
+
+ entry_new->addr = start;
+ entry_new->size = size;
+ entry_new->type = type;
table->nr_entries++;
}
--
2.45.2
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 24/29] x86/boot/e820: Standardize __init/__initdata tag placement
2025-04-21 18:51 [PATCH 00/29] x86/boot/e820: Assorted E820 table handling features and cleanups Ingo Molnar
` (22 preceding siblings ...)
2025-04-21 18:52 ` [PATCH 23/29] x86/boot/e820: Simplify & clarify __e820__range_add() a bit Ingo Molnar
@ 2025-04-21 18:52 ` Ingo Molnar
2025-04-21 18:52 ` [PATCH 25/29] x86/boot/e820: Simplify append_e820_table() and remove restriction on single-entry tables Ingo Molnar
` (5 subsequent siblings)
29 siblings, 0 replies; 68+ messages in thread
From: Ingo Molnar @ 2025-04-21 18:52 UTC (permalink / raw)
To: linux-kernel
Cc: Andy Shevchenko, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
David Woodhouse
So the e820.c file has a hodgepodge of __init and __initdata tag
placements:
static int __init e820_search_gap(unsigned long *max_gap_start, unsigned long *max_gap_size)
__init void e820__setup_pci_gap(void)
__init void e820__reallocate_tables(void)
void __init e820__memory_setup_extended(u64 phys_addr, u32 data_len)
void __init e820__register_nosave_regions(unsigned long limit_pfn)
static int __init e820__register_nvs_regions(void)
u64 __init e820__memblock_alloc_reserved(u64 size, u64 align)
Standardize on the style used by e820__setup_pci_gap() and place
them before the storage class.
In addition to the consistency, as a bonus this makes the grep output
rather clean looking:
__init void e820__range_remove(u64 start, u64 size, enum e820_type filter_type)
__init void e820__update_table_print(void)
__init static void e820__update_table_kexec(void)
__init static int e820_search_gap(unsigned long *max_gap_start, unsigned long *max_gap_size)
__init void e820__setup_pci_gap(void)
__init void e820__reallocate_tables(void)
__init void e820__memory_setup_extended(u64 phys_addr, u32 data_len)
__init void e820__register_nosave_regions(unsigned long limit_pfn)
__init static int e820__register_nvs_regions(void)
... and if one learns to just ignore the leftmost '__init' noise then
the rest of the line looks just like a regular C function definition.
With the 'mixed' tag placement style the __init tag breaks up the function's
prototype for no good reason.
Do the same for __initdata.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Shevchenko <andy@kernel.org>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
arch/x86/kernel/e820.c | 92 +++++++++++++++++++++++++-------------------------
1 file changed, 46 insertions(+), 46 deletions(-)
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index e5a50aadc631..ec5628a844dc 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -54,9 +54,9 @@
* re-propagated. So its main role is a temporary bootstrap storage of firmware
* specific memory layout data during early bootup.
*/
-static struct e820_table e820_table_init __initdata;
-static struct e820_table e820_table_kexec_init __initdata;
-static struct e820_table e820_table_firmware_init __initdata;
+__initdata static struct e820_table e820_table_init;
+__initdata static struct e820_table e820_table_kexec_init;
+__initdata static struct e820_table e820_table_firmware_init;
__refdata struct e820_table *e820_table = &e820_table_init;
__refdata struct e820_table *e820_table_kexec = &e820_table_kexec_init;
@@ -143,7 +143,7 @@ static struct e820_entry *__e820__mapped_all(u64 start, u64 end,
/*
* This function checks if the entire range <start,end> is mapped with type.
*/
-bool __init e820__mapped_all(u64 start, u64 end, enum e820_type type)
+__init bool e820__mapped_all(u64 start, u64 end, enum e820_type type)
{
return __e820__mapped_all(start, end, type);
}
@@ -161,7 +161,7 @@ int e820__get_entry_type(u64 start, u64 end)
/*
* Add a memory region to the kernel E820 map.
*/
-static void __init __e820__range_add(struct e820_table *table, u64 start, u64 size, enum e820_type type)
+__init static void __e820__range_add(struct e820_table *table, u64 start, u64 size, enum e820_type type)
{
u32 idx = table->nr_entries;
struct e820_entry *entry_new;
@@ -181,12 +181,12 @@ static void __init __e820__range_add(struct e820_table *table, u64 start, u64 si
table->nr_entries++;
}
-void __init e820__range_add(u64 start, u64 size, enum e820_type type)
+__init void e820__range_add(u64 start, u64 size, enum e820_type type)
{
__e820__range_add(e820_table, start, size, type);
}
-static void __init e820_print_type(enum e820_type type)
+__init static void e820_print_type(enum e820_type type)
{
switch (type) {
case E820_TYPE_RAM: pr_cont(" kernel usable RAM"); break;
@@ -236,7 +236,7 @@ static void e820_print_size(u64 size)
pr_cont(" %4llu TB", size/SZ_1T);
}
-static void __init e820__print_table(const char *who)
+__init static void e820__print_table(const char *who)
{
u64 range_end_prev = 0;
u32 idx;
@@ -343,12 +343,12 @@ struct change_member {
u64 addr;
};
-static struct change_member change_point_list[2*E820_MAX_ENTRIES] __initdata;
-static struct change_member *change_point[2*E820_MAX_ENTRIES] __initdata;
-static struct e820_entry *overlap_list[E820_MAX_ENTRIES] __initdata;
-static struct e820_entry new_entries[E820_MAX_ENTRIES] __initdata;
+__initdata static struct change_member change_point_list[2*E820_MAX_ENTRIES];
+__initdata static struct change_member *change_point[2*E820_MAX_ENTRIES];
+__initdata static struct e820_entry *overlap_list[E820_MAX_ENTRIES];
+__initdata static struct e820_entry new_entries[E820_MAX_ENTRIES];
-static int __init cpcompare(const void *a, const void *b)
+__init static int cpcompare(const void *a, const void *b)
{
struct change_member * const *app = a, * const *bpp = b;
const struct change_member *ap = *app, *bp = *bpp;
@@ -383,7 +383,7 @@ static bool e820_type_mergeable(enum e820_type type)
return true;
}
-int __init e820__update_table(struct e820_table *table)
+__init int e820__update_table(struct e820_table *table)
{
struct e820_entry *entries = table->entries;
u32 max_nr_entries = ARRAY_SIZE(table->entries);
@@ -483,7 +483,7 @@ int __init e820__update_table(struct e820_table *table)
return 0;
}
-static int __init __append_e820_table(struct boot_e820_entry *entries, u32 nr_entries)
+__init static int __append_e820_table(struct boot_e820_entry *entries, u32 nr_entries)
{
struct boot_e820_entry *entry = entries;
@@ -514,7 +514,7 @@ static int __init __append_e820_table(struct boot_e820_entry *entries, u32 nr_en
* will have given us a memory map that we can use to properly
* set up memory. If we aren't, we'll fake a memory map.
*/
-static int __init append_e820_table(struct boot_e820_entry *entries, u32 nr_entries)
+__init static int append_e820_table(struct boot_e820_entry *entries, u32 nr_entries)
{
/* Only one memory region (or negative)? Ignore it */
if (nr_entries < 2)
@@ -523,7 +523,7 @@ static int __init append_e820_table(struct boot_e820_entry *entries, u32 nr_entr
return __append_e820_table(entries, nr_entries);
}
-static u64 __init
+__init static u64
__e820__range_update(struct e820_table *table, u64 start, u64 size, enum e820_type old_type, enum e820_type new_type)
{
u64 end;
@@ -591,19 +591,19 @@ __e820__range_update(struct e820_table *table, u64 start, u64 size, enum e820_ty
return real_updated_size;
}
-u64 __init e820__range_update(u64 start, u64 size, enum e820_type old_type, enum e820_type new_type)
+__init u64 e820__range_update(u64 start, u64 size, enum e820_type old_type, enum e820_type new_type)
{
return __e820__range_update(e820_table, start, size, old_type, new_type);
}
-u64 __init e820__range_update_table(struct e820_table *t, u64 start, u64 size,
+__init u64 e820__range_update_table(struct e820_table *t, u64 start, u64 size,
enum e820_type old_type, enum e820_type new_type)
{
return __e820__range_update(t, start, size, old_type, new_type);
}
/* Remove a range of memory from the E820 table: */
-u64 __init e820__range_remove(u64 start, u64 size, enum e820_type old_type, bool check_type)
+__init u64 e820__range_remove(u64 start, u64 size, enum e820_type old_type, bool check_type)
{
u32 idx;
u64 end;
@@ -664,7 +664,7 @@ u64 __init e820__range_remove(u64 start, u64 size, enum e820_type old_type, bool
return real_removed_size;
}
-void __init e820__update_table_print(void)
+__init void e820__update_table_print(void)
{
if (e820__update_table(e820_table))
return;
@@ -673,7 +673,7 @@ void __init e820__update_table_print(void)
e820__print_table("modified");
}
-static void __init e820__update_table_kexec(void)
+__init static void e820__update_table_kexec(void)
{
e820__update_table(e820_table_kexec);
}
@@ -683,7 +683,7 @@ static void __init e820__update_table_kexec(void)
/*
* Search for a gap in the E820 memory space from 0 to MAX_GAP_END (4GB).
*/
-static int __init e820_search_gap(unsigned long *max_gap_start, unsigned long *max_gap_size)
+__init static int e820_search_gap(unsigned long *max_gap_start, unsigned long *max_gap_size)
{
u64 last = MAX_GAP_END;
int idx = e820_table->nr_entries;
@@ -786,7 +786,7 @@ __init void e820__reallocate_tables(void)
* the remaining (if any) entries are passed via the SETUP_E820_EXT node of
* struct setup_data, which is parsed here.
*/
-void __init e820__memory_setup_extended(u64 phys_addr, u32 data_len)
+__init void e820__memory_setup_extended(u64 phys_addr, u32 data_len)
{
int entries;
struct boot_e820_entry *extmap;
@@ -815,7 +815,7 @@ void __init e820__memory_setup_extended(u64 phys_addr, u32 data_len)
* This function requires the E820 map to be sorted and without any
* overlapping entries.
*/
-void __init e820__register_nosave_regions(unsigned long limit_pfn)
+__init void e820__register_nosave_regions(unsigned long limit_pfn)
{
u32 idx;
u64 last_addr = 0;
@@ -840,7 +840,7 @@ void __init e820__register_nosave_regions(unsigned long limit_pfn)
* Register ACPI NVS memory regions, so that we can save/restore them during
* hibernation and the subsequent resume:
*/
-static int __init e820__register_nvs_regions(void)
+__init static int e820__register_nvs_regions(void)
{
u32 idx;
@@ -864,7 +864,7 @@ core_initcall(e820__register_nvs_regions);
* This allows kexec to fake a new mptable, as if it came from the real
* system.
*/
-u64 __init e820__memblock_alloc_reserved(u64 size, u64 align)
+__init u64 e820__memblock_alloc_reserved(u64 size, u64 align)
{
u64 addr;
@@ -891,7 +891,7 @@ u64 __init e820__memblock_alloc_reserved(u64 size, u64 align)
/*
* Find the highest page frame number we have available
*/
-static unsigned long __init e820__end_ram_pfn(unsigned long limit_pfn)
+__init static unsigned long e820__end_ram_pfn(unsigned long limit_pfn)
{
u32 idx;
unsigned long last_pfn = 0;
@@ -927,20 +927,20 @@ static unsigned long __init e820__end_ram_pfn(unsigned long limit_pfn)
return last_pfn;
}
-unsigned long __init e820__end_of_ram_pfn(void)
+__init unsigned long e820__end_of_ram_pfn(void)
{
return e820__end_ram_pfn(MAX_ARCH_PFN);
}
-unsigned long __init e820__end_of_low_ram_pfn(void)
+__init unsigned long e820__end_of_low_ram_pfn(void)
{
return e820__end_ram_pfn(1UL << (32 - PAGE_SHIFT));
}
-static int userdef __initdata;
+__initdata static int userdef;
/* The "mem=nopentium" boot option disables 4MB page tables on 32-bit kernels: */
-static int __init parse_memopt(char *p)
+__init static int parse_memopt(char *p)
{
u64 mem_size;
@@ -974,7 +974,7 @@ static int __init parse_memopt(char *p)
}
early_param("mem", parse_memopt);
-static int __init parse_memmap_one(char *p)
+__init static int parse_memmap_one(char *p)
{
char *oldp;
u64 start_at, mem_size;
@@ -1031,7 +1031,7 @@ static int __init parse_memmap_one(char *p)
return *p == '\0' ? 0 : -EINVAL;
}
-static int __init parse_memmap_opt(char *str)
+__init static int parse_memmap_opt(char *str)
{
while (str) {
char *k = strchr(str, ',');
@@ -1052,7 +1052,7 @@ early_param("memmap", parse_memmap_opt);
* have been processed, in which case we already have an E820 table filled in
* via the parameter callback function(s), but it's not sorted and printed yet:
*/
-void __init e820__finish_early_params(void)
+__init void e820__finish_early_params(void)
{
if (userdef) {
if (e820__update_table(e820_table) < 0)
@@ -1063,7 +1063,7 @@ void __init e820__finish_early_params(void)
}
}
-static const char *__init e820_type_to_string(struct e820_entry *entry)
+__init static const char * e820_type_to_string(struct e820_entry *entry)
{
switch (entry->type) {
case E820_TYPE_RAM: return "System RAM";
@@ -1078,7 +1078,7 @@ static const char *__init e820_type_to_string(struct e820_entry *entry)
}
}
-static unsigned long __init e820_type_to_iomem_type(struct e820_entry *entry)
+__init static unsigned long e820_type_to_iomem_type(struct e820_entry *entry)
{
switch (entry->type) {
case E820_TYPE_RAM: return IORESOURCE_SYSTEM_RAM;
@@ -1093,7 +1093,7 @@ static unsigned long __init e820_type_to_iomem_type(struct e820_entry *entry)
}
}
-static unsigned long __init e820_type_to_iores_desc(struct e820_entry *entry)
+__init static unsigned long e820_type_to_iores_desc(struct e820_entry *entry)
{
switch (entry->type) {
case E820_TYPE_ACPI: return IORES_DESC_ACPI_TABLES;
@@ -1111,13 +1111,13 @@ static unsigned long __init e820_type_to_iores_desc(struct e820_entry *entry)
/*
* We assign one resource entry for each E820 map entry:
*/
-static struct resource __initdata *e820_res;
+__initdata static struct resource *e820_res;
/*
* Is this a device address region that should not be marked busy?
* (Versus system address regions that we register & lock early.)
*/
-static bool __init e820_device_region(enum e820_type type, struct resource *res)
+__init static bool e820_device_region(enum e820_type type, struct resource *res)
{
/* This is the legacy BIOS/DOS ROM-shadow + MMIO region: */
if (res->start < (1ULL<<20))
@@ -1146,7 +1146,7 @@ static bool __init e820_device_region(enum e820_type type, struct resource *res)
/*
* Mark E820 system regions as busy for the resource manager:
*/
-void __init e820__reserve_resources(void)
+__init void e820__reserve_resources(void)
{
u32 idx;
struct resource *res;
@@ -1193,7 +1193,7 @@ void __init e820__reserve_resources(void)
/*
* How much should we pad the end of RAM, depending on where it is?
*/
-static unsigned long __init ram_alignment(resource_size_t pos)
+__init static unsigned long ram_alignment(resource_size_t pos)
{
unsigned long mb = pos >> 20;
@@ -1211,7 +1211,7 @@ static unsigned long __init ram_alignment(resource_size_t pos)
#define MAX_RESOURCE_SIZE ((resource_size_t)-1)
-void __init e820__reserve_resources_late(void)
+__init void e820__reserve_resources_late(void)
{
u32 idx;
struct resource *res;
@@ -1261,7 +1261,7 @@ void __init e820__reserve_resources_late(void)
/*
* Pass the firmware (bootloader) E820 map to the kernel and process it:
*/
-char *__init e820__memory_setup_default(void)
+__init char * e820__memory_setup_default(void)
{
char *who = "BIOS-e820";
@@ -1299,7 +1299,7 @@ char *__init e820__memory_setup_default(void)
* E820 map - with an optional platform quirk available for virtual platforms
* to override this method of boot environment processing:
*/
-void __init e820__memory_setup(void)
+__init void e820__memory_setup(void)
{
char *who;
@@ -1315,7 +1315,7 @@ void __init e820__memory_setup(void)
e820__print_table(who);
}
-void __init e820__memblock_setup(void)
+__init void e820__memblock_setup(void)
{
u32 idx;
u64 end;
--
2.45.2
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 25/29] x86/boot/e820: Simplify append_e820_table() and remove restriction on single-entry tables
2025-04-21 18:51 [PATCH 00/29] x86/boot/e820: Assorted E820 table handling features and cleanups Ingo Molnar
` (23 preceding siblings ...)
2025-04-21 18:52 ` [PATCH 24/29] x86/boot/e820: Standardize __init/__initdata tag placement Ingo Molnar
@ 2025-04-21 18:52 ` Ingo Molnar
2025-04-21 18:52 ` [PATCH 26/29] x86/boot/e820: Remove e820__range_remove()'s unused return parameter Ingo Molnar
` (4 subsequent siblings)
29 siblings, 0 replies; 68+ messages in thread
From: Ingo Molnar @ 2025-04-21 18:52 UTC (permalink / raw)
To: linux-kernel
Cc: Andy Shevchenko, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
David Woodhouse
So append_e820_table() begins with this weird condition that checks 'nr_entries':
static int __init append_e820_table(struct boot_e820_entry *entries, u32 nr_entries)
{
/* Only one memory region (or negative)? Ignore it */
if (nr_entries < 2)
return -1;
Firstly, 'nr_entries' has been an u32 since 2017 and cannot be negative.
Secondly, there's nothing inherently wrong with single-entry E820 maps,
especially in virtualized environments.
So remove this restriction and remove the __append_e820_table()
indirection.
Also:
- fix/update comments
- remove obsolete comments
This shrinks the generated code a bit as well:
text data bss dec hex filename
7549 44072 0 51621 c9a5 e820.o.before
7533 44072 0 51605 c995 e820.o.after
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Shevchenko <andy@kernel.org>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
arch/x86/kernel/e820.c | 35 +++++++++++------------------------
1 file changed, 11 insertions(+), 24 deletions(-)
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index ec5628a844dc..27bc45a65807 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -483,17 +483,22 @@ __init int e820__update_table(struct e820_table *table)
return 0;
}
-__init static int __append_e820_table(struct boot_e820_entry *entries, u32 nr_entries)
+/*
+ * Copy the BIOS E820 map into the kernel's e820_table.
+ *
+ * Sanity-check it while we're at it..
+ */
+__init static int append_e820_table(struct boot_e820_entry *entries, u32 nr_entries)
{
struct boot_e820_entry *entry = entries;
while (nr_entries) {
u64 start = entry->addr;
- u64 size = entry->size;
- u64 end = start + size - 1;
- u32 type = entry->type;
+ u64 size = entry->size;
+ u64 end = start + size-1;
+ u32 type = entry->type;
- /* Ignore the entry on 64-bit overflow: */
+ /* Ignore the remaining entries on 64-bit overflow: */
if (start > end && likely(size))
return -1;
@@ -505,24 +510,6 @@ __init static int __append_e820_table(struct boot_e820_entry *entries, u32 nr_en
return 0;
}
-/*
- * Copy the BIOS E820 map into a safe place.
- *
- * Sanity-check it while we're at it..
- *
- * If we're lucky and live on a modern system, the setup code
- * will have given us a memory map that we can use to properly
- * set up memory. If we aren't, we'll fake a memory map.
- */
-__init static int append_e820_table(struct boot_e820_entry *entries, u32 nr_entries)
-{
- /* Only one memory region (or negative)? Ignore it */
- if (nr_entries < 2)
- return -1;
-
- return __append_e820_table(entries, nr_entries);
-}
-
__init static u64
__e820__range_update(struct e820_table *table, u64 start, u64 size, enum e820_type old_type, enum e820_type new_type)
{
@@ -796,7 +783,7 @@ __init void e820__memory_setup_extended(u64 phys_addr, u32 data_len)
entries = sdata->len / sizeof(*extmap);
extmap = (struct boot_e820_entry *)(sdata->data);
- __append_e820_table(extmap, entries);
+ append_e820_table(extmap, entries);
e820__update_table(e820_table);
memcpy(e820_table_kexec, e820_table, sizeof(*e820_table_kexec));
--
2.45.2
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 26/29] x86/boot/e820: Remove e820__range_remove()'s unused return parameter
2025-04-21 18:51 [PATCH 00/29] x86/boot/e820: Assorted E820 table handling features and cleanups Ingo Molnar
` (24 preceding siblings ...)
2025-04-21 18:52 ` [PATCH 25/29] x86/boot/e820: Simplify append_e820_table() and remove restriction on single-entry tables Ingo Molnar
@ 2025-04-21 18:52 ` Ingo Molnar
2025-04-21 18:52 ` [PATCH 27/29] x86/boot/e820: Simplify the e820__range_remove() API Ingo Molnar
` (3 subsequent siblings)
29 siblings, 0 replies; 68+ messages in thread
From: Ingo Molnar @ 2025-04-21 18:52 UTC (permalink / raw)
To: linux-kernel
Cc: Andy Shevchenko, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
David Woodhouse
None of the usage sites make use of the 'real_removed_size'
return parameter of e820__range_remove(), and it's hard
to contemplate much constructive use: E820 maps can have
holes, and removing a fixed range may result in removal
of any number of bytes from 0 to the requested size.
So remove this pointless calculation. This simplifies
the function a bit:
text data bss dec hex filename
7645 44072 0 51717 ca05 e820.o.before
7597 44072 0 51669 c9d5 e820.o.after
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Shevchenko <andy@kernel.org>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
arch/x86/include/asm/e820/api.h | 2 +-
arch/x86/kernel/e820.c | 8 +-------
2 files changed, 2 insertions(+), 8 deletions(-)
diff --git a/arch/x86/include/asm/e820/api.h b/arch/x86/include/asm/e820/api.h
index 54427b77bc19..9cf416f7a84f 100644
--- a/arch/x86/include/asm/e820/api.h
+++ b/arch/x86/include/asm/e820/api.h
@@ -16,7 +16,7 @@ extern bool e820__mapped_all(u64 start, u64 end, enum e820_type type);
extern void e820__range_add (u64 start, u64 size, enum e820_type type);
extern u64 e820__range_update(u64 start, u64 size, enum e820_type old_type, enum e820_type new_type);
-extern u64 e820__range_remove(u64 start, u64 size, enum e820_type old_type, bool check_type);
+extern void e820__range_remove(u64 start, u64 size, enum e820_type old_type, bool check_type);
extern u64 e820__range_update_table(struct e820_table *t, u64 start, u64 size, enum e820_type old_type, enum e820_type new_type);
extern int e820__update_table(struct e820_table *table);
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 27bc45a65807..2f9aecb9911c 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -590,11 +590,10 @@ __init u64 e820__range_update_table(struct e820_table *t, u64 start, u64 size,
}
/* Remove a range of memory from the E820 table: */
-__init u64 e820__range_remove(u64 start, u64 size, enum e820_type old_type, bool check_type)
+__init void e820__range_remove(u64 start, u64 size, enum e820_type old_type, bool check_type)
{
u32 idx;
u64 end;
- u64 real_removed_size = 0;
if (size > (ULLONG_MAX - start))
size = ULLONG_MAX - start;
@@ -617,7 +616,6 @@ __init u64 e820__range_remove(u64 start, u64 size, enum e820_type old_type, bool
/* Completely covered? */
if (entry->addr >= start && entry_end <= end) {
- real_removed_size += entry->size;
memset(entry, 0, sizeof(*entry));
continue;
}
@@ -626,7 +624,6 @@ __init u64 e820__range_remove(u64 start, u64 size, enum e820_type old_type, bool
if (entry->addr < start && entry_end > end) {
e820__range_add(end, entry_end - end, entry->type);
entry->size = start - entry->addr;
- real_removed_size += size;
continue;
}
@@ -636,8 +633,6 @@ __init u64 e820__range_remove(u64 start, u64 size, enum e820_type old_type, bool
if (final_start >= final_end)
continue;
- real_removed_size += final_end - final_start;
-
/*
* Left range could be head or tail, so need to update
* the size first:
@@ -648,7 +643,6 @@ __init u64 e820__range_remove(u64 start, u64 size, enum e820_type old_type, bool
entry->addr = final_end;
}
- return real_removed_size;
}
__init void e820__update_table_print(void)
--
2.45.2
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 27/29] x86/boot/e820: Simplify the e820__range_remove() API
2025-04-21 18:51 [PATCH 00/29] x86/boot/e820: Assorted E820 table handling features and cleanups Ingo Molnar
` (25 preceding siblings ...)
2025-04-21 18:52 ` [PATCH 26/29] x86/boot/e820: Remove e820__range_remove()'s unused return parameter Ingo Molnar
@ 2025-04-21 18:52 ` Ingo Molnar
2025-04-22 16:41 ` Andy Shevchenko
2025-04-21 18:52 ` [PATCH 28/29] x86/boot/e820: Make sure e820_search_gap() finds all gaps Ingo Molnar
` (2 subsequent siblings)
29 siblings, 1 reply; 68+ messages in thread
From: Ingo Molnar @ 2025-04-21 18:52 UTC (permalink / raw)
To: linux-kernel
Cc: Andy Shevchenko, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
David Woodhouse
Right now e820__range_remove() has two parameters to control the
E820 type of the range removed:
extern void e820__range_remove(u64 start, u64 size, enum e820_type old_type, bool check_type);
Since E820 types start at 1, zero has a natural meaning of 'no type.
Consolidate the (old_type,check_type) parameters into a single (filter_type)
parameter:
extern void e820__range_remove(u64 start, u64 size, enum e820_type filter_type);
Note that both e820__mapped_raw_any() and e820__mapped_any()
already have such semantics for their 'type' parameter, although
it's currently not used with '0' by in-kernel code.
Also, the __e820__mapped_all() internal helper already has such
semantics implemented as well, and the e820__get_entry_type() API
uses the '0' type to such effect.
This simplifies not just e820__range_remove(), and synchronizes its
use of type filters with other E820 API functions, but simplifies
usage sites as well, such as parse_memmap_one(), beyond the reduction
of the number of parameters:
- else if (from)
- e820__range_remove(start_at, mem_size, from, 1);
else
- e820__range_remove(start_at, mem_size, 0, 0);
+ e820__range_remove(start_at, mem_size, from);
The generated code gets smaller as well:
add/remove: 0/0 grow/shrink: 0/5 up/down: 0/-66 (-66)
Function old new delta
parse_memopt 112 107 -5
efi_init 1048 1039 -9
setup_arch 2719 2709 -10
e820__range_remove 283 273 -10
parse_memmap_opt 559 527 -32
Total: Before=22,675,600, After=22,675,534, chg -0.00%
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Shevchenko <andy@kernel.org>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
arch/x86/include/asm/e820/api.h | 2 +-
arch/x86/kernel/e820.c | 16 +++++++---------
arch/x86/kernel/setup.c | 4 ++--
arch/x86/platform/efi/efi.c | 3 +--
4 files changed, 11 insertions(+), 14 deletions(-)
diff --git a/arch/x86/include/asm/e820/api.h b/arch/x86/include/asm/e820/api.h
index 9cf416f7a84f..bbe0c8de976c 100644
--- a/arch/x86/include/asm/e820/api.h
+++ b/arch/x86/include/asm/e820/api.h
@@ -16,7 +16,7 @@ extern bool e820__mapped_all(u64 start, u64 end, enum e820_type type);
extern void e820__range_add (u64 start, u64 size, enum e820_type type);
extern u64 e820__range_update(u64 start, u64 size, enum e820_type old_type, enum e820_type new_type);
-extern void e820__range_remove(u64 start, u64 size, enum e820_type old_type, bool check_type);
+extern void e820__range_remove(u64 start, u64 size, enum e820_type filter_type);
extern u64 e820__range_update_table(struct e820_table *t, u64 start, u64 size, enum e820_type old_type, enum e820_type new_type);
extern int e820__update_table(struct e820_table *table);
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 2f9aecb9911c..b4fe9bebdeb5 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -590,7 +590,7 @@ __init u64 e820__range_update_table(struct e820_table *t, u64 start, u64 size,
}
/* Remove a range of memory from the E820 table: */
-__init void e820__range_remove(u64 start, u64 size, enum e820_type old_type, bool check_type)
+__init void e820__range_remove(u64 start, u64 size, enum e820_type filter_type)
{
u32 idx;
u64 end;
@@ -600,8 +600,8 @@ __init void e820__range_remove(u64 start, u64 size, enum e820_type old_type, boo
end = start + size;
printk(KERN_DEBUG "e820: remove [mem %#010Lx-%#010Lx]", start, end - 1);
- if (check_type)
- e820_print_type(old_type);
+ if (filter_type)
+ e820_print_type(filter_type);
pr_cont("\n");
for (idx = 0; idx < e820_table->nr_entries; idx++) {
@@ -609,7 +609,7 @@ __init void e820__range_remove(u64 start, u64 size, enum e820_type old_type, boo
u64 final_start, final_end;
u64 entry_end;
- if (check_type && entry->type != old_type)
+ if (filter_type && entry->type != filter_type)
continue;
entry_end = entry->addr + entry->size;
@@ -945,7 +945,7 @@ __init static int parse_memopt(char *p)
if (mem_size == 0)
return -EINVAL;
- e820__range_remove(mem_size, ULLONG_MAX - mem_size, E820_TYPE_RAM, 1);
+ e820__range_remove(mem_size, ULLONG_MAX - mem_size, E820_TYPE_RAM);
#ifdef CONFIG_MEMORY_HOTPLUG
max_mem_size = mem_size;
@@ -1001,12 +1001,10 @@ __init static int parse_memmap_one(char *p)
e820__range_update(start_at, mem_size, from, to);
else if (to)
e820__range_add(start_at, mem_size, to);
- else if (from)
- e820__range_remove(start_at, mem_size, from, 1);
else
- e820__range_remove(start_at, mem_size, 0, 0);
+ e820__range_remove(start_at, mem_size, from);
} else {
- e820__range_remove(mem_size, ULLONG_MAX - mem_size, E820_TYPE_RAM, 1);
+ e820__range_remove(mem_size, ULLONG_MAX - mem_size, E820_TYPE_RAM);
}
return *p == '\0' ? 0 : -EINVAL;
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 83cc265140c0..0f4d78939005 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -736,7 +736,7 @@ static void __init trim_bios_range(void)
* area (640Kb -> 1Mb) as RAM even though it is not.
* take them out.
*/
- e820__range_remove(BIOS_BEGIN, BIOS_END - BIOS_BEGIN, E820_TYPE_RAM, 1);
+ e820__range_remove(BIOS_BEGIN, BIOS_END - BIOS_BEGIN, E820_TYPE_RAM);
e820__update_table(e820_table);
}
@@ -758,7 +758,7 @@ static void __init e820_add_kernel_range(void)
return;
pr_warn(".text .data .bss are not marked as E820_TYPE_RAM!\n");
- e820__range_remove(start, size, E820_TYPE_RAM, 0);
+ e820__range_remove(start, size, 0);
e820__range_add(start, size, E820_TYPE_RAM);
}
diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index 463b784499a8..d00c6de7f3b7 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -333,8 +333,7 @@ static void __init efi_remove_e820_mmio(void)
if (size >= 256*1024) {
pr_info("Remove mem%02u: MMIO range=[0x%08llx-0x%08llx] (%lluMB) from e820 map\n",
i, start, end, size >> 20);
- e820__range_remove(start, size,
- E820_TYPE_RESERVED, 1);
+ e820__range_remove(start, size, E820_TYPE_RESERVED);
} else {
pr_info("Not removing mem%02u: MMIO range=[0x%08llx-0x%08llx] (%lluKB) from e820 map\n",
i, start, end, size >> 10);
--
2.45.2
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 28/29] x86/boot/e820: Make sure e820_search_gap() finds all gaps
2025-04-21 18:51 [PATCH 00/29] x86/boot/e820: Assorted E820 table handling features and cleanups Ingo Molnar
` (26 preceding siblings ...)
2025-04-21 18:52 ` [PATCH 27/29] x86/boot/e820: Simplify the e820__range_remove() API Ingo Molnar
@ 2025-04-21 18:52 ` Ingo Molnar
2025-04-21 18:52 ` [PATCH 29/29] x86/boot/e820: Treat non-type-2 'reserved' E820 region types as E820_TYPE_RESERVED Ingo Molnar
2025-04-22 6:58 ` [PATCH 00/29] x86/boot/e820: Assorted E820 table handling features and cleanups Arnd Bergmann
29 siblings, 0 replies; 68+ messages in thread
From: Ingo Molnar @ 2025-04-21 18:52 UTC (permalink / raw)
To: linux-kernel
Cc: Andy Shevchenko, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
David Woodhouse
The current implementation of e820_search_gap() searches gaps
in a reverse search from MAX_GAP_END back to 0, contrary to
what its main comment claims:
* Search for a gap in the E820 memory space from 0 to MAX_GAP_END (4GB).
But gaps can not only be beyond E820 RAM ranges, they can be below
them as well. For example this function will not find the proper
PCI gap for simplified memory map layouts that have a single RAM
range that crosses the 4GB boundary.
Rework the function to have a proper forward search of
E820 table entries.
This makes the code somewhat bigger:
text data bss dec hex filename
7613 44072 0 51685 c9e5 e820.o.before
7645 44072 0 51717 ca05 e820.o.after
but it now both implements what it claims to do, and is more
straightforward to read.
( This also allows 'idx' to be the regular u32 again, not an 'int'
underflowing to -1. )
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Shevchenko <andy@kernel.org>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
arch/x86/kernel/e820.c | 59 +++++++++++++++++++++++++++++++++++---------------
1 file changed, 41 insertions(+), 18 deletions(-)
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index b4fe9bebdeb5..f9caf4c922ea 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -666,30 +666,52 @@ __init static void e820__update_table_kexec(void)
*/
__init static int e820_search_gap(unsigned long *max_gap_start, unsigned long *max_gap_size)
{
- u64 last = MAX_GAP_END;
- int idx = e820_table->nr_entries;
+ struct e820_entry *entry;
+ u64 range_end_prev = 0;
int found = 0;
+ u32 idx;
- while (--idx >= 0) {
- u64 start = e820_table->entries[idx].addr;
- u64 end = start + e820_table->entries[idx].size;
+ for (idx = 0; idx < e820_table->nr_entries; idx++) {
+ u64 range_start, range_end;
- /*
- * Since "last" is at most 4GB, we know we'll
- * fit in 32 bits if this condition is true:
- */
- if (last > end) {
- unsigned long gap = last - end;
+ entry = e820_table->entries + idx;
+ range_start = entry->addr;
+ range_end = entry->addr + entry->size;
- if (gap > *max_gap_size) {
- *max_gap_size = gap;
- *max_gap_start = end;
- found = 1;
+ /* Process any gap before this entry: */
+ if (range_start > range_end_prev) {
+ u64 gap_start = range_end_prev;
+ u64 gap_end = range_start;
+ u64 gap_size;
+
+ if (gap_start < MAX_GAP_END) {
+ /* Make sure the entirety of the gap is below MAX_GAP_END: */
+ gap_end = min(gap_end, MAX_GAP_END);
+ gap_size = gap_end-gap_start;
+
+ if (gap_size >= *max_gap_size) {
+ *max_gap_start = gap_start;
+ *max_gap_size = gap_size;
+ found = 1;
+ }
}
}
- if (start < last)
- last = start;
+
+ range_end_prev = range_end;
+ }
+
+ /* Is there a usable gap beyond the last entry: */
+ if (entry->addr + entry->size < MAX_GAP_END) {
+ u64 gap_start = entry->addr + entry->size;
+ u64 gap_size = MAX_GAP_END-gap_start;
+
+ if (gap_size >= *max_gap_size) {
+ *max_gap_start = gap_start;
+ *max_gap_size = gap_size;
+ found = 1;
+ }
}
+
return found;
}
@@ -706,6 +728,7 @@ __init void e820__setup_pci_gap(void)
unsigned long max_gap_start, max_gap_size;
int found;
+ /* The minimum eligible gap size is 4MB: */
max_gap_size = 0x400000;
found = e820_search_gap(&max_gap_start, &max_gap_size);
@@ -725,7 +748,7 @@ __init void e820__setup_pci_gap(void)
pci_mem_start = max_gap_start;
pr_info("[gap %#010lx-%#010lx] available for PCI devices\n",
- max_gap_start, max_gap_start + max_gap_size - 1);
+ max_gap_start, max_gap_start + max_gap_size-1);
}
/*
--
2.45.2
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 29/29] x86/boot/e820: Treat non-type-2 'reserved' E820 region types as E820_TYPE_RESERVED
2025-04-21 18:51 [PATCH 00/29] x86/boot/e820: Assorted E820 table handling features and cleanups Ingo Molnar
` (27 preceding siblings ...)
2025-04-21 18:52 ` [PATCH 28/29] x86/boot/e820: Make sure e820_search_gap() finds all gaps Ingo Molnar
@ 2025-04-21 18:52 ` Ingo Molnar
2025-04-25 3:55 ` H. Peter Anvin
2025-04-22 6:58 ` [PATCH 00/29] x86/boot/e820: Assorted E820 table handling features and cleanups Arnd Bergmann
29 siblings, 1 reply; 68+ messages in thread
From: Ingo Molnar @ 2025-04-21 18:52 UTC (permalink / raw)
To: linux-kernel
Cc: Andy Shevchenko, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
David Woodhouse
Paul Menzel pointed out that ACPI specification 6.3 defines 'reserved'
E820 region types as E820_TYPE_RESERVED (type 2):
> Table 15-374 *Address Range Types* in the ACPI specification 6.3 says:
>
> > Reserved for future use. OSPM must treat any range of this type as if
> > the type returned was AddressRangeReserved.
This has relevance for device address regions, which on some firmware such
as CoreBoot, get passed to Linux as type-13 - which the kernel
treats as system regions and registers them as unavailable to drivers:
static bool __init e820_device_region(enum e820_type type, struct resource *res)
...
case E820_TYPE_ACPI:
case E820_TYPE_NVS:
case E820_TYPE_UNUSABLE:
default:
return false;
Users of such systems will see device breakage on Linux, which they
have to work around with iomem=relaxed kind of boot time hacks to
turn off resource conflict checking.
Follow the ACPI spec and change the Linux E820 code to treat unknown/reserved
E820 region types as E820_TYPE_RESERVED device memory regions.
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Shevchenko <andy@kernel.org>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
arch/x86/kernel/e820.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index f9caf4c922ea..c36261d78109 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -1140,8 +1140,10 @@ __init static bool e820_device_region(enum e820_type type, struct resource *res)
case E820_TYPE_ACPI:
case E820_TYPE_NVS:
case E820_TYPE_UNUSABLE:
- default:
return false;
+ /* Reserved E820 types should be treated as device memory regions: */
+ default:
+ return true;
}
}
--
2.45.2
^ permalink raw reply related [flat|nested] 68+ messages in thread
* Re: [PATCH 02/29] x86/boot/e820: Simplify e820__print_table() a bit
2025-04-21 18:51 ` [PATCH 02/29] x86/boot/e820: Simplify e820__print_table() a bit Ingo Molnar
@ 2025-04-22 5:56 ` Mike Rapoport
2025-05-15 10:15 ` Ingo Molnar
0 siblings, 1 reply; 68+ messages in thread
From: Mike Rapoport @ 2025-04-22 5:56 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Andy Shevchenko, Arnd Bergmann, Borislav Petkov,
Juergen Gross, H . Peter Anvin, Kees Cook, Linus Torvalds,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, David Woodhouse
On Mon, Apr 21, 2025 at 08:51:42PM +0200, Ingo Molnar wrote:
> Introduce 'entry' for the current table entry and shorten
> repetitious use of e820_table->entries[i].
>
> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> Cc: Andy Shevchenko <andy@kernel.org>
> Cc: Arnd Bergmann <arnd@kernel.org>
> Cc: David Woodhouse <dwmw@amazon.co.uk>
> Cc: H. Peter Anvin <hpa@zytor.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
> ---
> arch/x86/kernel/e820.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
> index 4a81f9e94137..b1a30bca56cd 100644
> --- a/arch/x86/kernel/e820.c
> +++ b/arch/x86/kernel/e820.c
> @@ -204,12 +204,14 @@ void __init e820__print_table(char *who)
> int i;
>
> for (i = 0; i < e820_table->nr_entries; i++) {
> + struct e820_entry *entry = e820_table->entries + i;
> +
> pr_info("%s: [mem %#018Lx-%#018Lx] ",
> who,
> - e820_table->entries[i].addr,
> - e820_table->entries[i].addr + e820_table->entries[i].size - 1);
> + entry->addr,
> + entry->addr + entry->size-1);
nit: entry->size - 1
>
> - e820_print_type(e820_table->entries[i].type);
> + e820_print_type(entry->type);
> pr_cont("\n");
> }
> }
> --
> 2.45.2
>
--
Sincerely yours,
Mike.
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 05/29] x86/boot/e820: Print gaps in the E820 table
2025-04-21 18:51 ` [PATCH 05/29] x86/boot/e820: Print gaps in the E820 table Ingo Molnar
@ 2025-04-22 6:00 ` Mike Rapoport
2025-04-22 6:26 ` Andy Shevchenko
2025-04-22 6:42 ` Andy Shevchenko
1 sibling, 1 reply; 68+ messages in thread
From: Mike Rapoport @ 2025-04-22 6:00 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Andy Shevchenko, Arnd Bergmann, Borislav Petkov,
Juergen Gross, H . Peter Anvin, Kees Cook, Linus Torvalds,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, David Woodhouse
On Mon, Apr 21, 2025 at 08:51:45PM +0200, Ingo Molnar wrote:
> Gaps in the E820 table are not obvious at a glance and can
> easily be overlooked.
>
> Print out gaps in the E820 table:
>
> Before:
>
> BIOS-provided physical RAM map:
> BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
> BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
> BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
> BIOS-e820: [mem 0x0000000000100000-0x000000007ffdbfff] usable
> BIOS-e820: [mem 0x000000007ffdc000-0x000000007fffffff] reserved
> BIOS-e820: [mem 0x00000000b0000000-0x00000000bfffffff] reserved
> BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved
> BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
> BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
> BIOS-e820: [mem 0x000000fd00000000-0x000000ffffffffff] reserved
> After:
>
> BIOS-provided physical RAM map:
> BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
> BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
> BIOS-e820: [gap 0x00000000000a0000-0x00000000000effff]
> BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
> BIOS-e820: [mem 0x0000000000100000-0x000000007ffdbfff] usable
> BIOS-e820: [mem 0x000000007ffdc000-0x000000007fffffff] reserved
> BIOS-e820: [gap 0x0000000080000000-0x00000000afffffff]
> BIOS-e820: [mem 0x00000000b0000000-0x00000000bfffffff] reserved
> BIOS-e820: [gap 0x00000000c0000000-0x00000000fed1bfff]
> BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved
> BIOS-e820: [gap 0x00000000fed20000-0x00000000feffbfff]
> BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
> BIOS-e820: [gap 0x00000000ff000000-0x00000000fffbffff]
> BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
> BIOS-e820: [gap 0x0000000100000000-0x000000fcffffffff]
> BIOS-e820: [mem 0x000000fd00000000-0x000000ffffffffff] reserved
>
> Also warn about badly ordered E820 table entries:
>
> BUG: out of order E820 entry!
>
> ( this is printed before the entry is printed, so there's no need to
> print any additional data with the warning. )
>
> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> Cc: Andy Shevchenko <andy@kernel.org>
> Cc: Arnd Bergmann <arnd@kernel.org>
> Cc: David Woodhouse <dwmw@amazon.co.uk>
> Cc: H. Peter Anvin <hpa@zytor.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
> ---
> arch/x86/kernel/e820.c | 22 ++++++++++++++++++----
> 1 file changed, 18 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
> index 4244b6d53fd0..a8edfa375fbd 100644
> --- a/arch/x86/kernel/e820.c
> +++ b/arch/x86/kernel/e820.c
> @@ -201,18 +201,32 @@ static void __init e820_print_type(enum e820_type type)
>
> static void __init e820__print_table(const char *who)
> {
> + u64 range_end_prev = 0;
> int i;
>
> for (i = 0; i < e820_table->nr_entries; i++) {
> struct e820_entry *entry = e820_table->entries + i;
> + u64 range_start, range_end;
>
> - pr_info("%s: [mem %#018Lx-%#018Lx] ",
> - who,
> - entry->addr,
> - entry->addr + entry->size-1);
> + range_start = entry->addr;
> + range_end = entry->addr + entry->size;
>
> + /* Out of order E820 maps should not happen: */
> + if (range_start < range_end_prev)
> + pr_info("BUG: out of order E820 entry!\n");
Maybe "BIOS BUG"?
> +
> + if (range_start > range_end_prev) {
> + pr_info("%s: [gap %#018Lx-%#018Lx]\n",
> + who,
> + range_end_prev,
> + range_start-1);
range_start - 1 ;-)
> + }
> +
> + pr_info("%s: [mem %#018Lx-%#018Lx] ", who, range_start, range_end-1);
> e820_print_type(entry->type);
> pr_cont("\n");
> +
> + range_end_prev = range_end;
> }
> }
>
> --
> 2.45.2
>
--
Sincerely yours,
Mike.
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 05/29] x86/boot/e820: Print gaps in the E820 table
2025-04-22 6:00 ` Mike Rapoport
@ 2025-04-22 6:26 ` Andy Shevchenko
2025-05-15 10:23 ` Ingo Molnar
0 siblings, 1 reply; 68+ messages in thread
From: Andy Shevchenko @ 2025-04-22 6:26 UTC (permalink / raw)
To: Mike Rapoport
Cc: Ingo Molnar, linux-kernel, Andy Shevchenko, Arnd Bergmann,
Borislav Petkov, Juergen Gross, H . Peter Anvin, Kees Cook,
Linus Torvalds, Paul Menzel, Peter Zijlstra, Thomas Gleixner,
David Woodhouse
Tue, Apr 22, 2025 at 09:00:00AM +0300, Mike Rapoport kirjoitti:
> On Mon, Apr 21, 2025 at 08:51:45PM +0200, Ingo Molnar wrote:
...
> > + if (range_start < range_end_prev)
> > + pr_info("BUG: out of order E820 entry!\n");
>
> Maybe "BIOS BUG"?
FWIW, we have FW_BUG definition, with that
pr_info(FW_BUG "out of order E820 entry!\n");
--
With Best Regards,
Andy Shevchenko
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 03/29] x86/boot/e820: Simplify the PPro Erratum #50 workaround
2025-04-21 18:51 ` [PATCH 03/29] x86/boot/e820: Simplify the PPro Erratum #50 workaround Ingo Molnar
@ 2025-04-22 6:29 ` Andy Shevchenko
2025-05-15 10:20 ` Ingo Molnar
0 siblings, 1 reply; 68+ messages in thread
From: Andy Shevchenko @ 2025-04-22 6:29 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Andy Shevchenko, Arnd Bergmann, Borislav Petkov,
Juergen Gross, H . Peter Anvin, Kees Cook, Linus Torvalds,
Mike Rapoport, Paul Menzel, Peter Zijlstra, Thomas Gleixner,
David Woodhouse
Mon, Apr 21, 2025 at 08:51:43PM +0200, Ingo Molnar kirjoitti:
> No need to print out the table - users won't really be able
> to tell much from it anyway and the messages around this
> erratum are unnecessarily obtuse.
>
> Instead clearly inform the user that a 256 kB hole is being
> punched in their memory map at the 1.75 GB physical address.
>
> Not that there are many PPro users left. :-)
...
> + e820__range_update(0x70000000ULL, 0x40000ULL, E820_TYPE_RAM, E820_TYPE_RESERVED);
Perhaps, 0x40000ULL --> SZ_256K ?
--
With Best Regards,
Andy Shevchenko
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 08/29] x86/boot/e820: Print E820_TYPE_RAM entries as ... RAM entries
2025-04-21 18:51 ` [PATCH 08/29] x86/boot/e820: Print E820_TYPE_RAM entries as ... RAM entries Ingo Molnar
@ 2025-04-22 6:31 ` Andy Shevchenko
2025-04-22 6:43 ` Mike Rapoport
2025-04-22 7:06 ` Mike Rapoport
1 sibling, 1 reply; 68+ messages in thread
From: Andy Shevchenko @ 2025-04-22 6:31 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Andy Shevchenko, Arnd Bergmann, Borislav Petkov,
Juergen Gross, H . Peter Anvin, Kees Cook, Linus Torvalds,
Mike Rapoport, Paul Menzel, Peter Zijlstra, Thomas Gleixner,
David Woodhouse
Mon, Apr 21, 2025 at 08:51:48PM +0200, Ingo Molnar kirjoitti:
> So it is a bit weird that the actual RAM entries of the E820 table
> are not actually called RAM, but 'usable':
>
> BIOS-e820: [mem 0x0000000000100000-0x000000007ffdbfff] 1.9 GB usable
>
> 'usable' is pretty passive-aggressive in that context and ambiguous,
> most E820 entries denote 'usable' address ranges - reserved ranges
> may be used by devices, or the platform.
>
> Clarify and disambiguate this by making the boot log entry
> explicitly say 'kernel usable RAM':
>
> BIOS-e820: [mem 0x0000000000100000-0x000000007ffdbfff] 1.9 GB kernel usable RAM
Can't user space use that RAM?
Shouldn't we rather refer to "OS usable RAM"?
--
With Best Regards,
Andy Shevchenko
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 07/29] x86/boot/e820: Print out sizes of E820 memory ranges
2025-04-21 18:51 ` [PATCH 07/29] x86/boot/e820: Print out sizes of E820 memory ranges Ingo Molnar
@ 2025-04-22 6:38 ` Andy Shevchenko
2025-05-15 10:44 ` Ingo Molnar
2025-04-22 6:53 ` Mike Rapoport
1 sibling, 1 reply; 68+ messages in thread
From: Andy Shevchenko @ 2025-04-22 6:38 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Andy Shevchenko, Arnd Bergmann, Borislav Petkov,
Juergen Gross, H . Peter Anvin, Kees Cook, Linus Torvalds,
Mike Rapoport, Paul Menzel, Peter Zijlstra, Thomas Gleixner,
David Woodhouse
Mon, Apr 21, 2025 at 08:51:47PM +0200, Ingo Molnar kirjoitti:
> Before:
>
> BIOS-provided physical RAM map:
> BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
> BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
> BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
> BIOS-e820: [mem 0x0000000000100000-0x000000007ffdbfff] usable
> BIOS-e820: [mem 0x000000007ffdc000-0x000000007fffffff] reserved
> BIOS-e820: [mem 0x00000000b0000000-0x00000000bfffffff] reserved
> BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved
> BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
> BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
> BIOS-e820: [mem 0x000000fd00000000-0x000000ffffffffff] reserved
>
> After:
>
> BIOS-provided physical RAM map:
> BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] 639 KB kernel usable RAM
> BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] 1 KB reserved
> BIOS-e820: [gap 0x00000000000a0000-0x00000000000effff] 320 KB ...
> BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] 64 KB reserved
> BIOS-e820: [mem 0x0000000000100000-0x000000007ffdbfff] 1.9 GB kernel usable RAM
> BIOS-e820: [mem 0x000000007ffdc000-0x000000007fffffff] 144 KB reserved
> BIOS-e820: [gap 0x0000000080000000-0x00000000afffffff] 768 MB ...
> BIOS-e820: [mem 0x00000000b0000000-0x00000000bfffffff] 256 MB reserved
> BIOS-e820: [gap 0x00000000c0000000-0x00000000fed1bfff] 1005.1 MB ...
> BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] 16 KB reserved
> BIOS-e820: [gap 0x00000000fed20000-0x00000000feffbfff] 2.8 MB ...
> BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] 16 KB reserved
> BIOS-e820: [gap 0x00000000ff000000-0x00000000fffbffff] 15.7 MB ...
> BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] 256 KB reserved
> BIOS-e820: [gap 0x0000000100000000-0x000000fcffffffff] 1008 GB ...
> BIOS-e820: [mem 0x000000fd00000000-0x000000ffffffffff] 12 GB reserved
>
> Note how a 1-digit precision field is printed out if a range is
> fractional in its largest-enclosing natural size unit.
>
> So the "256 MB" and "12 GB" fields above denote exactly 256 MB and
> 12 GB regions, while "1.9 GB" signals the region's fractional nature
> and it being just below 2GB.
>
> Printing E820 maps with such details visualizes 'weird' ranges
> at a glance, and gives users a better understanding of how
> large the various ranges are, without having to perform hexadecimal
> subtraction in their minds.
...
> +/*
> + * Print out the size of a E820 region, in human-readable
> + * fashion, going from KB, MB, GB to TB units.
> + *
> + * Print out fractional sizes with a single digit of precision.
> + */
> +static void e820_print_size(u64 size)
> +{
> + if (size < SZ_1M) {
> + if (size & (SZ_1K-1))
> + pr_cont(" %4llu.%01llu KB", size/SZ_1K, 10*(size & (SZ_1K-1))/SZ_1K);
> + else
> + pr_cont(" %4llu KB", size/SZ_1K);
I would add some spaces here and there for the sake of readability.
> + return;
> + }
> + if (size < SZ_1G) {
Can be written in one line as
} else if (...) {
Ditto for the rest.
> + if (size & (SZ_1M-1))
> + pr_cont(" %4llu.%01llu MB", size/SZ_1M, 10*(size & (SZ_1M-1))/SZ_1M);
> + else
> + pr_cont(" %4llu MB", size/SZ_1M);
> + return;
> + }
> + if (size < SZ_1T) {
> + if (size & (SZ_1G-1))
> + pr_cont(" %4llu.%01llu GB", size/SZ_1G, 10*(size & (SZ_1G-1))/SZ_1G);
> + else
> + pr_cont(" %4llu GB", size/SZ_1G);
> + return;
> + }
> + if (size & (SZ_1T-1))
> + pr_cont(" %4llu.%01llu TB", size/SZ_1T, 10*(size & (SZ_1T-1))/SZ_1T);
> + else
> + pr_cont(" %4llu TB", size/SZ_1T);
> +}
Don't you want to use string_helpers.h provided API?
string_get_size().
--
With Best Regards,
Andy Shevchenko
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 05/29] x86/boot/e820: Print gaps in the E820 table
2025-04-21 18:51 ` [PATCH 05/29] x86/boot/e820: Print gaps in the E820 table Ingo Molnar
2025-04-22 6:00 ` Mike Rapoport
@ 2025-04-22 6:42 ` Andy Shevchenko
2025-05-15 10:28 ` Ingo Molnar
1 sibling, 1 reply; 68+ messages in thread
From: Andy Shevchenko @ 2025-04-22 6:42 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Andy Shevchenko, Arnd Bergmann, Borislav Petkov,
Juergen Gross, H . Peter Anvin, Kees Cook, Linus Torvalds,
Mike Rapoport, Paul Menzel, Peter Zijlstra, Thomas Gleixner,
David Woodhouse
Mon, Apr 21, 2025 at 08:51:45PM +0200, Ingo Molnar kirjoitti:
> Gaps in the E820 table are not obvious at a glance and can
> easily be overlooked.
>
> Print out gaps in the E820 table:
>
> Before:
>
> BIOS-provided physical RAM map:
> BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
> BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
> BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
> BIOS-e820: [mem 0x0000000000100000-0x000000007ffdbfff] usable
> BIOS-e820: [mem 0x000000007ffdc000-0x000000007fffffff] reserved
> BIOS-e820: [mem 0x00000000b0000000-0x00000000bfffffff] reserved
> BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved
> BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
> BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
> BIOS-e820: [mem 0x000000fd00000000-0x000000ffffffffff] reserved
> After:
>
> BIOS-provided physical RAM map:
> BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
> BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
> BIOS-e820: [gap 0x00000000000a0000-0x00000000000effff]
> BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
> BIOS-e820: [mem 0x0000000000100000-0x000000007ffdbfff] usable
> BIOS-e820: [mem 0x000000007ffdc000-0x000000007fffffff] reserved
> BIOS-e820: [gap 0x0000000080000000-0x00000000afffffff]
> BIOS-e820: [mem 0x00000000b0000000-0x00000000bfffffff] reserved
> BIOS-e820: [gap 0x00000000c0000000-0x00000000fed1bfff]
> BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved
> BIOS-e820: [gap 0x00000000fed20000-0x00000000feffbfff]
> BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
> BIOS-e820: [gap 0x00000000ff000000-0x00000000fffbffff]
> BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
> BIOS-e820: [gap 0x0000000100000000-0x000000fcffffffff]
> BIOS-e820: [mem 0x000000fd00000000-0x000000ffffffffff] reserved
>
> Also warn about badly ordered E820 table entries:
>
> BUG: out of order E820 entry!
>
> ( this is printed before the entry is printed, so there's no need to
> print any additional data with the warning. )
...
> + u64 range_start, range_end;
struct range (from range.h) and...
> + range_start = entry->addr;
> + range_end = entry->addr + entry->size;
>
> + /* Out of order E820 maps should not happen: */
> + if (range_start < range_end_prev)
> + pr_info("BUG: out of order E820 entry!\n");
> +
> + if (range_start > range_end_prev) {
> + pr_info("%s: [gap %#018Lx-%#018Lx]\n",
> + who,
> + range_end_prev,
> + range_start-1);
%pra
with who mentioned the "gap"?
> + }
--
With Best Regards,
Andy Shevchenko
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 08/29] x86/boot/e820: Print E820_TYPE_RAM entries as ... RAM entries
2025-04-22 6:31 ` Andy Shevchenko
@ 2025-04-22 6:43 ` Mike Rapoport
2025-05-15 11:04 ` Ingo Molnar
0 siblings, 1 reply; 68+ messages in thread
From: Mike Rapoport @ 2025-04-22 6:43 UTC (permalink / raw)
To: Andy Shevchenko
Cc: Ingo Molnar, linux-kernel, Andy Shevchenko, Arnd Bergmann,
Borislav Petkov, Juergen Gross, H . Peter Anvin, Kees Cook,
Linus Torvalds, Paul Menzel, Peter Zijlstra, Thomas Gleixner,
David Woodhouse
On Tue, Apr 22, 2025 at 09:31:31AM +0300, Andy Shevchenko wrote:
> Mon, Apr 21, 2025 at 08:51:48PM +0200, Ingo Molnar kirjoitti:
> > So it is a bit weird that the actual RAM entries of the E820 table
> > are not actually called RAM, but 'usable':
> >
> > BIOS-e820: [mem 0x0000000000100000-0x000000007ffdbfff] 1.9 GB usable
> >
> > 'usable' is pretty passive-aggressive in that context and ambiguous,
> > most E820 entries denote 'usable' address ranges - reserved ranges
> > may be used by devices, or the platform.
> >
> > Clarify and disambiguate this by making the boot log entry
> > explicitly say 'kernel usable RAM':
> >
> > BIOS-e820: [mem 0x0000000000100000-0x000000007ffdbfff] 1.9 GB kernel usable RAM
>
> Can't user space use that RAM?
>
> Shouldn't we rather refer to "OS usable RAM"?
Or "System RAM", just like in /proc/iomem
> --
> With Best Regards,
> Andy Shevchenko
>
>
--
Sincerely yours,
Mike.
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 10/29] x86/boot/e820: Use 'u64' consistently instead of 'unsigned long long'
2025-04-21 18:51 ` [PATCH 10/29] x86/boot/e820: Use 'u64' consistently instead of 'unsigned long long' Ingo Molnar
@ 2025-04-22 6:44 ` Andy Shevchenko
2025-05-15 11:20 ` Ingo Molnar
0 siblings, 1 reply; 68+ messages in thread
From: Andy Shevchenko @ 2025-04-22 6:44 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Andy Shevchenko, Arnd Bergmann, Borislav Petkov,
Juergen Gross, H . Peter Anvin, Kees Cook, Linus Torvalds,
Mike Rapoport, Paul Menzel, Peter Zijlstra, Thomas Gleixner,
David Woodhouse
Mon, Apr 21, 2025 at 08:51:50PM +0200, Ingo Molnar kirjoitti:
> There's a number of structure fields and local variables related
> to E820 entry physical addresses that are defined as 'unsigned long long',
> but then are compared to u64 fields.
>
> Make the types all consistently u64.
...
> + u64 start = e820_table->entries[i].addr;
> + u64 end = start + e820_table->entries[i].size;
Perhaps struct range as well?
--
With Best Regards,
Andy Shevchenko
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 07/29] x86/boot/e820: Print out sizes of E820 memory ranges
2025-04-21 18:51 ` [PATCH 07/29] x86/boot/e820: Print out sizes of E820 memory ranges Ingo Molnar
2025-04-22 6:38 ` Andy Shevchenko
@ 2025-04-22 6:53 ` Mike Rapoport
2025-05-15 11:00 ` Ingo Molnar
1 sibling, 1 reply; 68+ messages in thread
From: Mike Rapoport @ 2025-04-22 6:53 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Andy Shevchenko, Arnd Bergmann, Borislav Petkov,
Juergen Gross, H . Peter Anvin, Kees Cook, Linus Torvalds,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, David Woodhouse
On Mon, Apr 21, 2025 at 08:51:47PM +0200, Ingo Molnar wrote:
> Before:
>
> BIOS-provided physical RAM map:
> BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
> BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
> BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
> BIOS-e820: [mem 0x0000000000100000-0x000000007ffdbfff] usable
> BIOS-e820: [mem 0x000000007ffdc000-0x000000007fffffff] reserved
> BIOS-e820: [mem 0x00000000b0000000-0x00000000bfffffff] reserved
> BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved
> BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
> BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
> BIOS-e820: [mem 0x000000fd00000000-0x000000ffffffffff] reserved
>
> After:
>
> BIOS-provided physical RAM map:
> BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] 639 KB kernel usable RAM
> BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] 1 KB reserved
> BIOS-e820: [gap 0x00000000000a0000-0x00000000000effff] 320 KB ...
> BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] 64 KB reserved
> BIOS-e820: [mem 0x0000000000100000-0x000000007ffdbfff] 1.9 GB kernel usable RAM
> BIOS-e820: [mem 0x000000007ffdc000-0x000000007fffffff] 144 KB reserved
> BIOS-e820: [gap 0x0000000080000000-0x00000000afffffff] 768 MB ...
> BIOS-e820: [mem 0x00000000b0000000-0x00000000bfffffff] 256 MB reserved
> BIOS-e820: [gap 0x00000000c0000000-0x00000000fed1bfff] 1005.1 MB ...
> BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] 16 KB reserved
> BIOS-e820: [gap 0x00000000fed20000-0x00000000feffbfff] 2.8 MB ...
> BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] 16 KB reserved
> BIOS-e820: [gap 0x00000000ff000000-0x00000000fffbffff] 15.7 MB ...
> BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] 256 KB reserved
> BIOS-e820: [gap 0x0000000100000000-0x000000fcffffffff] 1008 GB ...
> BIOS-e820: [mem 0x000000fd00000000-0x000000ffffffffff] 12 GB reserved
>
> Note how a 1-digit precision field is printed out if a range is
> fractional in its largest-enclosing natural size unit.
>
> So the "256 MB" and "12 GB" fields above denote exactly 256 MB and
> 12 GB regions, while "1.9 GB" signals the region's fractional nature
> and it being just below 2GB.
>
> Printing E820 maps with such details visualizes 'weird' ranges
> at a glance, and gives users a better understanding of how
> large the various ranges are, without having to perform hexadecimal
> subtraction in their minds.
>
> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> Cc: Andy Shevchenko <andy@kernel.org>
> Cc: Arnd Bergmann <arnd@kernel.org>
> Cc: David Woodhouse <dwmw@amazon.co.uk>
> Cc: H. Peter Anvin <hpa@zytor.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
> (cherry picked from commit d1ac6b8718575a7ea2f0a1ff347835a8879df673)
> ---
> arch/x86/kernel/e820.c | 47 +++++++++++++++++++++++++++++++++++++++++++++--
> 1 file changed, 45 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
> index 10bd10bd5672..8ee89962fcbf 100644
> --- a/arch/x86/kernel/e820.c
> +++ b/arch/x86/kernel/e820.c
> @@ -199,6 +199,41 @@ static void __init e820_print_type(enum e820_type type)
> }
> }
>
> +/*
> + * Print out the size of a E820 region, in human-readable
> + * fashion, going from KB, MB, GB to TB units.
> + *
> + * Print out fractional sizes with a single digit of precision.
> + */
> +static void e820_print_size(u64 size)
> +{
> + if (size < SZ_1M) {
> + if (size & (SZ_1K-1))
> + pr_cont(" %4llu.%01llu KB", size/SZ_1K, 10*(size & (SZ_1K-1))/SZ_1K);
> + else
> + pr_cont(" %4llu KB", size/SZ_1K);
> + return;
> + }
I'd make this a helper, e.g
static void __e820_print_size(u64 size, u64 unit, const char *unit_name)
{
if (size & (unit - 1)) {
u64 fraction = 10 * (size & (unit - 1))/unit;
pr_cont(" %4llu.%01llu %s", size/unit, fraction, unit_name);
} else {
pr_cont(" %4llu %s", size/unit, unit_name);
}
}
> + if (size < SZ_1G) {
> + if (size & (SZ_1M-1))
> + pr_cont(" %4llu.%01llu MB", size/SZ_1M, 10*(size & (SZ_1M-1))/SZ_1M);
> + else
> + pr_cont(" %4llu MB", size/SZ_1M);
> + return;
> + }
> + if (size < SZ_1T) {
> + if (size & (SZ_1G-1))
> + pr_cont(" %4llu.%01llu GB", size/SZ_1G, 10*(size & (SZ_1G-1))/SZ_1G);
> + else
> + pr_cont(" %4llu GB", size/SZ_1G);
> + return;
> + }
> + if (size & (SZ_1T-1))
> + pr_cont(" %4llu.%01llu TB", size/SZ_1T, 10*(size & (SZ_1T-1))/SZ_1T);
> + else
> + pr_cont(" %4llu TB", size/SZ_1T);
> +}
> +
> static void __init e820__print_table(const char *who)
> {
> u64 range_end_prev = 0;
> @@ -215,14 +250,22 @@ static void __init e820__print_table(const char *who)
> if (range_start < range_end_prev)
> pr_info("BUG: out of order E820 entry!\n");
>
> + /* Print gaps, if any: */
> if (range_start > range_end_prev) {
> - pr_info("%s: [gap %#018Lx-%#018Lx]\n",
> + u64 gap_size = range_start - range_end_prev;
> +
> + pr_info("%s: [gap %#018Lx-%#018Lx]",
> who,
> range_end_prev,
> range_start-1);
> +
> + e820_print_size(gap_size);
> + pr_cont(" ...\n");
> }
>
> - pr_info("%s: [mem %#018Lx-%#018Lx] ", who, range_start, range_end-1);
> + /* Print allocated ranges: */
> + pr_info("%s: [mem %#018Lx-%#018Lx]", who, range_start, range_end-1);
> + e820_print_size(entry->size);
> e820_print_type(entry->type);
> pr_cont("\n");
>
> --
> 2.45.2
>
--
Sincerely yours,
Mike.
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 00/29] x86/boot/e820: Assorted E820 table handling features and cleanups
2025-04-21 18:51 [PATCH 00/29] x86/boot/e820: Assorted E820 table handling features and cleanups Ingo Molnar
` (28 preceding siblings ...)
2025-04-21 18:52 ` [PATCH 29/29] x86/boot/e820: Treat non-type-2 'reserved' E820 region types as E820_TYPE_RESERVED Ingo Molnar
@ 2025-04-22 6:58 ` Arnd Bergmann
2025-05-15 11:56 ` Ingo Molnar
29 siblings, 1 reply; 68+ messages in thread
From: Arnd Bergmann @ 2025-04-22 6:58 UTC (permalink / raw)
To: Ingo Molnar, linux-kernel
Cc: Andy Shevchenko, Borislav Petkov, Juergen Gross, H. Peter Anvin,
Kees Cook, Linus Torvalds, Mike Rapoport, Paul Menzel,
Peter Zijlstra, Thomas Gleixner
On Mon, Apr 21, 2025, at 20:51, Ingo Molnar wrote:
>
> - Assorted cleanups: type cleanups, simplifications, standardization
> of coding patterns, etc.
Since you are already looking at cleaning up the types and testing
a lot, I wonder if you could make sure this also works for a 32-bit
phys_addr_t when booting a 32-bit kernel with and without
CONFIG_X86_PAE. In my recent cleanup series I originally
changed phys_addr_t to 32 bit after removing CONFIG_HIGHMEM_64G,
but this caused regressions, so it's still left as u64 even
though it should not be needed any more.
Arnd
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 08/29] x86/boot/e820: Print E820_TYPE_RAM entries as ... RAM entries
2025-04-21 18:51 ` [PATCH 08/29] x86/boot/e820: Print E820_TYPE_RAM entries as ... RAM entries Ingo Molnar
2025-04-22 6:31 ` Andy Shevchenko
@ 2025-04-22 7:06 ` Mike Rapoport
2025-05-15 11:19 ` [PATCH 30/29] x86/boot/e820: Unify e820_print_type() and e820_type_to_string() Ingo Molnar
1 sibling, 1 reply; 68+ messages in thread
From: Mike Rapoport @ 2025-04-22 7:06 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Andy Shevchenko, Arnd Bergmann, Borislav Petkov,
Juergen Gross, H . Peter Anvin, Kees Cook, Linus Torvalds,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, David Woodhouse
On Mon, Apr 21, 2025 at 08:51:48PM +0200, Ingo Molnar wrote:
> So it is a bit weird that the actual RAM entries of the E820 table
> are not actually called RAM, but 'usable':
>
> BIOS-e820: [mem 0x0000000000100000-0x000000007ffdbfff] 1.9 GB usable
>
> 'usable' is pretty passive-aggressive in that context and ambiguous,
> most E820 entries denote 'usable' address ranges - reserved ranges
> may be used by devices, or the platform.
>
> Clarify and disambiguate this by making the boot log entry
> explicitly say 'kernel usable RAM':
>
> BIOS-e820: [mem 0x0000000000100000-0x000000007ffdbfff] 1.9 GB kernel usable RAM
>
> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> Cc: Andy Shevchenko <andy@kernel.org>
> Cc: Arnd Bergmann <arnd@kernel.org>
> Cc: David Woodhouse <dwmw@amazon.co.uk>
> Cc: H. Peter Anvin <hpa@zytor.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
> ---
> arch/x86/kernel/e820.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
> index 8ee89962fcbf..99f997ae88dc 100644
> --- a/arch/x86/kernel/e820.c
> +++ b/arch/x86/kernel/e820.c
> @@ -187,7 +187,7 @@ void __init e820__range_add(u64 start, u64 size, enum e820_type type)
> static void __init e820_print_type(enum e820_type type)
> {
> switch (type) {
> - case E820_TYPE_RAM: pr_cont(" usable"); break;
> + case E820_TYPE_RAM: pr_cont(" kernel usable RAM"); break;
> case E820_TYPE_RESERVED: pr_cont(" reserved"); break;
> case E820_TYPE_SOFT_RESERVED: pr_cont(" soft reserved"); break;
> case E820_TYPE_ACPI: pr_cont(" ACPI data"); break;
We have e820_type_to_string(), IMO the whole switch here can be replaced by
pr_cont(" %s", e820_type_to_string(type));
> --
> 2.45.2
>
--
Sincerely yours,
Mike.
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 14/29] x86/boot/e820: Clean up __e820__range_add() a bit
2025-04-21 18:51 ` [PATCH 14/29] x86/boot/e820: Clean up __e820__range_add() a bit Ingo Molnar
@ 2025-04-22 10:10 ` Andy Shevchenko
2025-05-15 11:21 ` Ingo Molnar
0 siblings, 1 reply; 68+ messages in thread
From: Andy Shevchenko @ 2025-04-22 10:10 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, David Woodhouse
On Mon, Apr 21, 2025 at 08:51:54PM +0200, Ingo Molnar wrote:
> - Use 'idx' index variable instead of a weird 'x'
> - Make the error message E820-specific
> - Group the code a bit better
...
> - if (x >= ARRAY_SIZE(table->entries)) {
> - pr_err("too many entries; ignoring [mem %#010llx-%#010llx]\n",
> - start, start + size - 1);
> + if (idx >= ARRAY_SIZE(table->entries)) {
> + pr_err("too many E820 table entries; ignoring [mem %#010llx-%#010llx]\n",
> + start, start + size-1);
size - 1 ?
> return;
> }
--
With Best Regards,
Andy Shevchenko
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 17/29] x86/boot/e820: Standardize e820 table index variable names under 'idx'
2025-04-21 18:51 ` [PATCH 17/29] x86/boot/e820: Standardize e820 table index variable names under 'idx' Ingo Molnar
@ 2025-04-22 10:23 ` Andy Shevchenko
2025-05-15 11:32 ` [PATCH 31/29] x86/boot/e820: Move index increments outside accessors in e820__update_table() Ingo Molnar
2025-05-15 11:37 ` [PATCH 17/29] x86/boot/e820: Standardize e820 table index variable names under 'idx' Ingo Molnar
0 siblings, 2 replies; 68+ messages in thread
From: Andy Shevchenko @ 2025-04-22 10:23 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, David Woodhouse
On Mon, Apr 21, 2025 at 08:51:57PM +0200, Ingo Molnar wrote:
...
> + int idx;
> + u32 idx, chg_idx, chg_nr;
What about sanitizing the type as well to be let's say unsigned int idx in all cases?
...
> + change_point[chg_idx]->addr = entries[idx].addr;
> + change_point[chg_idx++]->entry = &entries[idx];
> + change_point[chg_idx]->addr = entries[idx].addr + entries[idx].size;
> + change_point[chg_idx++]->entry = &entries[idx];
Does GCC 15 not produce any warnings here? Linus recently complain on some code
with index increment inside the accessor. Perhaps just
change_point[chg_idx]->entry = &entries[idx];
chg_idx++;
?
...
> + for (idx = 0; idx < overlap_entries; idx++) {
> + if (overlap_list[idx] == change_point[chg_idx]->entry)
> + overlap_list[idx] = overlap_list[overlap_entries-1];
overlap_entries - 1 ?
> }
...
> + while (--idx >= 0) {
while (idx--) {
should work as well, no?
--
With Best Regards,
Andy Shevchenko
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 19/29] x86/boot/e820: Standardize e820 table index variable types under 'u32'
2025-04-21 18:51 ` [PATCH 19/29] x86/boot/e820: Standardize e820 table index variable types under 'u32' Ingo Molnar
@ 2025-04-22 15:13 ` Andy Shevchenko
2025-05-15 11:39 ` Ingo Molnar
0 siblings, 1 reply; 68+ messages in thread
From: Andy Shevchenko @ 2025-04-22 15:13 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, David Woodhouse
On Mon, Apr 21, 2025 at 08:51:59PM +0200, Ingo Molnar wrote:
> So we have 'idx' types of 'int' and 'unsigned int', and sometimes
> we assign 'u32' fields such as e820_table::nr_entries to these 'int'
> values.
>
> While there's no real risk of overflow with these tables, make it
> all cleaner by standardizing on a single type: u32.
>
> This also happens to shrink the code a bit:
>
> text data bss dec hex filename
> 7745 44072 0 51817 ca69 e820.o.before
> 7613 44072 0 51685 c9e5 e820.o.after
Ah, here it is! You can ignore my respective comment in one of the previous
patches. Perhaps better to group that one (which converts to use idx) and this
one, so they will be sequential in the series?
--
With Best Regards,
Andy Shevchenko
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 20/29] x86/boot/e820: Clean up e820__setup_pci_gap()/e820_search_gap() a bit
2025-04-21 18:52 ` [PATCH 20/29] x86/boot/e820: Clean up e820__setup_pci_gap()/e820_search_gap() a bit Ingo Molnar
@ 2025-04-22 16:37 ` Andy Shevchenko
2025-05-15 11:44 ` Ingo Molnar
0 siblings, 1 reply; 68+ messages in thread
From: Andy Shevchenko @ 2025-04-22 16:37 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, David Woodhouse
On Mon, Apr 21, 2025 at 08:52:00PM +0200, Ingo Molnar wrote:
> Use a bit more readable variable names, we haven't run out of
> underscore characters in the kernel yet.
Size notes:
> + gap_size = 0x400000;
SZ_4M ?
...
> + gap_start = (max_pfn << PAGE_SHIFT) + 1024*1024;
SZ_1M ?
--
With Best Regards,
Andy Shevchenko
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 27/29] x86/boot/e820: Simplify the e820__range_remove() API
2025-04-21 18:52 ` [PATCH 27/29] x86/boot/e820: Simplify the e820__range_remove() API Ingo Molnar
@ 2025-04-22 16:41 ` Andy Shevchenko
2025-05-15 11:49 ` Ingo Molnar
0 siblings, 1 reply; 68+ messages in thread
From: Andy Shevchenko @ 2025-04-22 16:41 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, David Woodhouse
On Mon, Apr 21, 2025 at 08:52:07PM +0200, Ingo Molnar wrote:
> Right now e820__range_remove() has two parameters to control the
> E820 type of the range removed:
>
> extern void e820__range_remove(u64 start, u64 size, enum e820_type old_type, bool check_type);
>
> Since E820 types start at 1, zero has a natural meaning of 'no type.
>
> Consolidate the (old_type,check_type) parameters into a single (filter_type)
> parameter:
>
> extern void e820__range_remove(u64 start, u64 size, enum e820_type filter_type);
>
> Note that both e820__mapped_raw_any() and e820__mapped_any()
> already have such semantics for their 'type' parameter, although
> it's currently not used with '0' by in-kernel code.
>
> Also, the __e820__mapped_all() internal helper already has such
> semantics implemented as well, and the e820__get_entry_type() API
> uses the '0' type to such effect.
>
> This simplifies not just e820__range_remove(), and synchronizes its
> use of type filters with other E820 API functions, but simplifies
> usage sites as well, such as parse_memmap_one(), beyond the reduction
> of the number of parameters:
>
> - else if (from)
> - e820__range_remove(start_at, mem_size, from, 1);
> else
> - e820__range_remove(start_at, mem_size, 0, 0);
> + e820__range_remove(start_at, mem_size, from);
>
> The generated code gets smaller as well:
>
> add/remove: 0/0 grow/shrink: 0/5 up/down: 0/-66 (-66)
>
> Function old new delta
> parse_memopt 112 107 -5
> efi_init 1048 1039 -9
> setup_arch 2719 2709 -10
> e820__range_remove 283 273 -10
> parse_memmap_opt 559 527 -32
>
> Total: Before=22,675,600, After=22,675,534, chg -0.00%
> extern void e820__range_add (u64 start, u64 size, enum e820_type type);
> extern u64 e820__range_update(u64 start, u64 size, enum e820_type old_type, enum e820_type new_type);
> -extern void e820__range_remove(u64 start, u64 size, enum e820_type old_type, bool check_type);
> +extern void e820__range_remove(u64 start, u64 size, enum e820_type filter_type);
> extern u64 e820__range_update_table(struct e820_table *t, u64 start, u64 size, enum e820_type old_type, enum e820_type new_type);
>
> extern int e820__update_table(struct e820_table *table);
Wondering if are going to get rid of 'extern' for the functions...
--
With Best Regards,
Andy Shevchenko
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 29/29] x86/boot/e820: Treat non-type-2 'reserved' E820 region types as E820_TYPE_RESERVED
2025-04-21 18:52 ` [PATCH 29/29] x86/boot/e820: Treat non-type-2 'reserved' E820 region types as E820_TYPE_RESERVED Ingo Molnar
@ 2025-04-25 3:55 ` H. Peter Anvin
2025-05-15 10:04 ` [PATCH -v2 29/29] x86/boot/e820: Introduce E820_TYPE_13 and treat it as a device region Ingo Molnar
0 siblings, 1 reply; 68+ messages in thread
From: H. Peter Anvin @ 2025-04-25 3:55 UTC (permalink / raw)
To: Ingo Molnar, linux-kernel
Cc: Andy Shevchenko, Arnd Bergmann, Borislav Petkov, Juergen Gross,
Kees Cook, Linus Torvalds, Mike Rapoport, Paul Menzel,
Peter Zijlstra, Thomas Gleixner, David Woodhouse
On 4/21/25 11:52, Ingo Molnar wrote:
> Paul Menzel pointed out that ACPI specification 6.3 defines 'reserved'
> E820 region types as E820_TYPE_RESERVED (type 2):
>
> > Table 15-374 *Address Range Types* in the ACPI specification 6.3 says:
> >
> > > Reserved for future use. OSPM must treat any range of this type as if
> > > the type returned was AddressRangeReserved.
>
> This has relevance for device address regions, which on some firmware such
> as CoreBoot, get passed to Linux as type-13 - which the kernel
> treats as system regions and registers them as unavailable to drivers:
>
... so we should handle 13 accordingly (and probably request that the
ACPI committee permanently reserve it. It would have been better to use
negative numbers for OS-specific things.)
However, if we run into a value that we have never seen, say something
like 84, we shouldn't assume that it is safe to do anything at all to
it; in particular we really don't want to assume that it is safe to
place I/O devices there.
Note that devices may be a priori set up in type 2 memory; it pretty
much means "this device is treated specially by firmware, don't move it
around or bad things will happen."
-hpa
^ permalink raw reply [flat|nested] 68+ messages in thread
* [PATCH -v2 29/29] x86/boot/e820: Introduce E820_TYPE_13 and treat it as a device region
2025-04-25 3:55 ` H. Peter Anvin
@ 2025-05-15 10:04 ` Ingo Molnar
0 siblings, 0 replies; 68+ messages in thread
From: Ingo Molnar @ 2025-05-15 10:04 UTC (permalink / raw)
To: H. Peter Anvin
Cc: linux-kernel, Andy Shevchenko, Arnd Bergmann, Borislav Petkov,
Juergen Gross, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, David Woodhouse
* H. Peter Anvin <hpa@zytor.com> wrote:
> On 4/21/25 11:52, Ingo Molnar wrote:
> > Paul Menzel pointed out that ACPI specification 6.3 defines 'reserved'
> > E820 region types as E820_TYPE_RESERVED (type 2):
> >
> > > Table 15-374 *Address Range Types* in the ACPI specification 6.3 says:
> > >
> > > > Reserved for future use. OSPM must treat any range of this type as if
> > > > the type returned was AddressRangeReserved.
> >
> > This has relevance for device address regions, which on some firmware such
> > as CoreBoot, get passed to Linux as type-13 - which the kernel
> > treats as system regions and registers them as unavailable to drivers:
> >
>
> ... so we should handle 13 accordingly (and probably request that the
> ACPI committee permanently reserve it. It would have been better to
> use negative numbers for OS-specific things.)
>
> However, if we run into a value that we have never seen, say
> something like 84, we shouldn't assume that it is safe to do anything
> at all to it; in particular we really don't want to assume that it is
> safe to place I/O devices there.
>
> Note that devices may be a priori set up in type 2 memory; it pretty
> much means "this device is treated specially by firmware, don't move
> it around or bad things will happen."
Okay, agreed, this approach makes a lot of sense.
How about the replacement patch below? It basically implements your
recommendation: E820_TYPE_13 follows E820_TYPE_RESERVED behavior
(device region), while other unknown types are still treated
conservatively: system region, don't touch, don't merge.
Thanks,
Ingo
====================================>
From: Ingo Molnar <mingo@kernel.org>
Date: Sat, 19 Apr 2025 21:50:24 +0200
Subject: [PATCH] x86/boot/e820: Introduce E820_TYPE_13 and treat it as a device region
Paul Menzel pointed out that ACPI specification 6.3 defines 'reserved'
E820 region types as E820_TYPE_RESERVED (type 2):
> Table 15-374 *Address Range Types* in the ACPI specification 6.3 says:
>
> > Reserved for future use. OSPM must treat any range of this type as if
> > the type returned was AddressRangeReserved.
This has relevance for device address regions, which on some firmware such
as CoreBoot, get passed to Linux as type-13 - which the kernel
treats as system regions and registers them as unavailable to drivers:
static bool __init e820_device_region(enum e820_type type, struct resource *res)
...
case E820_TYPE_ACPI:
case E820_TYPE_NVS:
case E820_TYPE_UNUSABLE:
default:
return false;
Users of such systems will see device breakage on Linux, which they
have to work around with iomem=relaxed kind of boot time hacks to
turn off resource conflict checking.
Partially follow the ACPI spec and add a limited quirk for the
E820_TYPE_13 type, and allow it to be claimed by device drivers
(similarly to E820_TYPE_RESERVED).
Don't change behavior for other unknown types.
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Suggested-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Shevchenko <andy@kernel.org>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
arch/x86/include/asm/e820/types.h | 4 ++++
arch/x86/kernel/e820.c | 11 +++++++++--
2 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/e820/types.h b/arch/x86/include/asm/e820/types.h
index df12f7ee75d3..2430120c2528 100644
--- a/arch/x86/include/asm/e820/types.h
+++ b/arch/x86/include/asm/e820/types.h
@@ -27,6 +27,10 @@ enum e820_type {
* 6 was assigned differently. Some time they will learn... )
*/
E820_TYPE_PRAM = 12,
+ /*
+ * Certain firmware such as CoreBoot uses this type:
+ */
+ E820_TYPE_13 = 13,
/*
* Special-purpose memory is indicated to the system via the
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 6649d49c9c0f..e2579e385181 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -1075,7 +1075,7 @@ __init static const char * e820_type_to_string(struct e820_entry *entry)
case E820_TYPE_PRAM: return "Persistent Memory (legacy)";
case E820_TYPE_PMEM: return "Persistent Memory";
case E820_TYPE_RESERVED: return "Reserved";
- case E820_TYPE_SOFT_RESERVED: return "Soft Reserved";
+ case E820_TYPE_13: return "Type 13";
default: return "Unknown E820 type";
}
}
@@ -1090,6 +1090,7 @@ __init static unsigned long e820_type_to_iomem_type(struct e820_entry *entry)
case E820_TYPE_PRAM: /* Fall-through: */
case E820_TYPE_PMEM: /* Fall-through: */
case E820_TYPE_RESERVED: /* Fall-through: */
+ case E820_TYPE_13: /* Fall-through: */
case E820_TYPE_SOFT_RESERVED: /* Fall-through: */
default: return IORESOURCE_MEM;
}
@@ -1102,7 +1103,8 @@ __init static unsigned long e820_type_to_iores_desc(struct e820_entry *entry)
case E820_TYPE_NVS: return IORES_DESC_ACPI_NV_STORAGE;
case E820_TYPE_PMEM: return IORES_DESC_PERSISTENT_MEMORY;
case E820_TYPE_PRAM: return IORES_DESC_PERSISTENT_MEMORY_LEGACY;
- case E820_TYPE_RESERVED: return IORES_DESC_RESERVED;
+ case E820_TYPE_RESERVED: /* Fall-through: */
+ case E820_TYPE_13: return IORES_DESC_RESERVED;
case E820_TYPE_SOFT_RESERVED: return IORES_DESC_SOFT_RESERVED;
case E820_TYPE_RAM: /* Fall-through: */
case E820_TYPE_UNUSABLE: /* Fall-through: */
@@ -1132,6 +1134,7 @@ __init static bool e820_device_region(enum e820_type type, struct resource *res)
*/
switch (type) {
case E820_TYPE_RESERVED:
+ case E820_TYPE_13:
case E820_TYPE_SOFT_RESERVED:
case E820_TYPE_PRAM:
case E820_TYPE_PMEM:
@@ -1140,6 +1143,10 @@ __init static bool e820_device_region(enum e820_type type, struct resource *res)
case E820_TYPE_ACPI:
case E820_TYPE_NVS:
case E820_TYPE_UNUSABLE:
+ /*
+ * Unknown E820 types should be treated passively, here we
+ * don't allow them to be claimed by device drivers:
+ */
default:
return false;
}
^ permalink raw reply related [flat|nested] 68+ messages in thread
* Re: [PATCH 02/29] x86/boot/e820: Simplify e820__print_table() a bit
2025-04-22 5:56 ` Mike Rapoport
@ 2025-05-15 10:15 ` Ingo Molnar
0 siblings, 0 replies; 68+ messages in thread
From: Ingo Molnar @ 2025-05-15 10:15 UTC (permalink / raw)
To: Mike Rapoport
Cc: linux-kernel, Andy Shevchenko, Arnd Bergmann, Borislav Petkov,
Juergen Gross, H . Peter Anvin, Kees Cook, Linus Torvalds,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, David Woodhouse
* Mike Rapoport <rppt@kernel.org> wrote:
> On Mon, Apr 21, 2025 at 08:51:42PM +0200, Ingo Molnar wrote:
> > Introduce 'entry' for the current table entry and shorten
> > repetitious use of e820_table->entries[i].
> >
> > Signed-off-by: Ingo Molnar <mingo@kernel.org>
> > Cc: Andy Shevchenko <andy@kernel.org>
> > Cc: Arnd Bergmann <arnd@kernel.org>
> > Cc: David Woodhouse <dwmw@amazon.co.uk>
> > Cc: H. Peter Anvin <hpa@zytor.com>
> > Cc: Kees Cook <keescook@chromium.org>
> > Cc: Linus Torvalds <torvalds@linux-foundation.org>
> > Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
> > ---
> > arch/x86/kernel/e820.c | 8 +++++---
> > 1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
> > index 4a81f9e94137..b1a30bca56cd 100644
> > --- a/arch/x86/kernel/e820.c
> > +++ b/arch/x86/kernel/e820.c
> > @@ -204,12 +204,14 @@ void __init e820__print_table(char *who)
> > int i;
> >
> > for (i = 0; i < e820_table->nr_entries; i++) {
> > + struct e820_entry *entry = e820_table->entries + i;
> > +
> > pr_info("%s: [mem %#018Lx-%#018Lx] ",
> > who,
> > - e820_table->entries[i].addr,
> > - e820_table->entries[i].addr + e820_table->entries[i].size - 1);
> > + entry->addr,
> > + entry->addr + entry->size-1);
>
> nit: entry->size - 1
So it's a judgement call, and here IMHO it's easier on the eyes and
more straightforward when entry->size-1 is grouped together visually,
which is the offset of the last byte in this E820 region.
I don't agree with the mindless conversion patches that do this
indiscriminately:
s/x-1
/x - 1
Typographically and stylistically there's nothing wrong with 'x-1'.
In fact the grouping has advantages, for example this hypothethical
buggy piece of code:
addr_next - size-1
is much more obviously incorrect 'at a glance' during review than this
one:
addr_next - size - 1
Thanks,
Ingo
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 03/29] x86/boot/e820: Simplify the PPro Erratum #50 workaround
2025-04-22 6:29 ` Andy Shevchenko
@ 2025-05-15 10:20 ` Ingo Molnar
0 siblings, 0 replies; 68+ messages in thread
From: Ingo Molnar @ 2025-05-15 10:20 UTC (permalink / raw)
To: Andy Shevchenko
Cc: linux-kernel, Andy Shevchenko, Arnd Bergmann, Borislav Petkov,
Juergen Gross, H . Peter Anvin, Kees Cook, Linus Torvalds,
Mike Rapoport, Paul Menzel, Peter Zijlstra, Thomas Gleixner,
David Woodhouse
* Andy Shevchenko <andy.shevchenko@gmail.com> wrote:
> Mon, Apr 21, 2025 at 08:51:43PM +0200, Ingo Molnar kirjoitti:
> > No need to print out the table - users won't really be able
> > to tell much from it anyway and the messages around this
> > erratum are unnecessarily obtuse.
> >
> > Instead clearly inform the user that a 256 kB hole is being
> > punched in their memory map at the 1.75 GB physical address.
> >
> > Not that there are many PPro users left. :-)
>
> ...
>
> > + e820__range_update(0x70000000ULL, 0x40000ULL, E820_TYPE_RAM, E820_TYPE_RESERVED);
>
> Perhaps, 0x40000ULL --> SZ_256K ?
Indeed! I folded in the change below.
Thanks,
Ingo
==============================>
arch/x86/kernel/setup.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index a1e2d64bb28d..f40dffc3e014 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -985,7 +985,7 @@ void __init setup_arch(char **cmdline_p)
#ifdef CONFIG_X86_32
if (ppro_with_ram_bug()) {
pr_info("Applying PPro RAM bug workaround: punching 256 kB hole at 1.75 GB physical.\n");
- e820__range_update(0x70000000ULL, 0x40000ULL, E820_TYPE_RAM, E820_TYPE_RESERVED);
+ e820__range_update(0x70000000ULL, SZ_256K, E820_TYPE_RAM, E820_TYPE_RESERVED);
e820__update_table(e820_table);
}
#else
^ permalink raw reply related [flat|nested] 68+ messages in thread
* Re: [PATCH 05/29] x86/boot/e820: Print gaps in the E820 table
2025-04-22 6:26 ` Andy Shevchenko
@ 2025-05-15 10:23 ` Ingo Molnar
0 siblings, 0 replies; 68+ messages in thread
From: Ingo Molnar @ 2025-05-15 10:23 UTC (permalink / raw)
To: Andy Shevchenko
Cc: Mike Rapoport, linux-kernel, Andy Shevchenko, Arnd Bergmann,
Borislav Petkov, Juergen Gross, H . Peter Anvin, Kees Cook,
Linus Torvalds, Paul Menzel, Peter Zijlstra, Thomas Gleixner,
David Woodhouse
* Andy Shevchenko <andy.shevchenko@gmail.com> wrote:
> Tue, Apr 22, 2025 at 09:00:00AM +0300, Mike Rapoport kirjoitti:
> > On Mon, Apr 21, 2025 at 08:51:45PM +0200, Ingo Molnar wrote:
>
> ...
>
> > > + if (range_start < range_end_prev)
> > > + pr_info("BUG: out of order E820 entry!\n");
> >
> > Maybe "BIOS BUG"?
>
> FWIW, we have FW_BUG definition, with that
>
> pr_info(FW_BUG "out of order E820 entry!\n");
Agreed, I've folded in the change below.
Thanks,
Ingo
=============================>
arch/x86/kernel/e820.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 36b74b233bd6..c8833ac9cbf9 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -213,7 +213,7 @@ static void __init e820__print_table(const char *who)
/* Out of order E820 maps should not happen: */
if (range_start < range_end_prev)
- pr_info("BUG: out of order E820 entry!\n");
+ pr_info(FW_BUG "out of order E820 entry!\n");
if (range_start > range_end_prev) {
pr_info("%s: [gap %#018Lx-%#018Lx]\n",
^ permalink raw reply related [flat|nested] 68+ messages in thread
* Re: [PATCH 05/29] x86/boot/e820: Print gaps in the E820 table
2025-04-22 6:42 ` Andy Shevchenko
@ 2025-05-15 10:28 ` Ingo Molnar
2025-05-16 9:13 ` Andy Shevchenko
0 siblings, 1 reply; 68+ messages in thread
From: Ingo Molnar @ 2025-05-15 10:28 UTC (permalink / raw)
To: Andy Shevchenko
Cc: linux-kernel, Andy Shevchenko, Arnd Bergmann, Borislav Petkov,
Juergen Gross, H . Peter Anvin, Kees Cook, Linus Torvalds,
Mike Rapoport, Paul Menzel, Peter Zijlstra, Thomas Gleixner,
David Woodhouse
* Andy Shevchenko <andy.shevchenko@gmail.com> wrote:
> Mon, Apr 21, 2025 at 08:51:45PM +0200, Ingo Molnar kirjoitti:
> > Gaps in the E820 table are not obvious at a glance and can
> > easily be overlooked.
> >
> > Print out gaps in the E820 table:
> >
> > Before:
> >
> > BIOS-provided physical RAM map:
> > BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
> > BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
> > BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
> > BIOS-e820: [mem 0x0000000000100000-0x000000007ffdbfff] usable
> > BIOS-e820: [mem 0x000000007ffdc000-0x000000007fffffff] reserved
> > BIOS-e820: [mem 0x00000000b0000000-0x00000000bfffffff] reserved
> > BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved
> > BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
> > BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
> > BIOS-e820: [mem 0x000000fd00000000-0x000000ffffffffff] reserved
> > After:
> >
> > BIOS-provided physical RAM map:
> > BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
> > BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
> > BIOS-e820: [gap 0x00000000000a0000-0x00000000000effff]
> > BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
> > BIOS-e820: [mem 0x0000000000100000-0x000000007ffdbfff] usable
> > BIOS-e820: [mem 0x000000007ffdc000-0x000000007fffffff] reserved
> > BIOS-e820: [gap 0x0000000080000000-0x00000000afffffff]
> > BIOS-e820: [mem 0x00000000b0000000-0x00000000bfffffff] reserved
> > BIOS-e820: [gap 0x00000000c0000000-0x00000000fed1bfff]
> > BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved
> > BIOS-e820: [gap 0x00000000fed20000-0x00000000feffbfff]
> > BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
> > BIOS-e820: [gap 0x00000000ff000000-0x00000000fffbffff]
> > BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
> > BIOS-e820: [gap 0x0000000100000000-0x000000fcffffffff]
> > BIOS-e820: [mem 0x000000fd00000000-0x000000ffffffffff] reserved
> >
> > Also warn about badly ordered E820 table entries:
> >
> > BUG: out of order E820 entry!
> >
> > ( this is printed before the entry is printed, so there's no need to
> > print any additional data with the warning. )
>
> ...
>
> > + u64 range_start, range_end;
>
> struct range (from range.h) and...
Yeah, using those primitives makes sense, but right now the e820 code
isn't using them, and it's better to have similar & unified range
handling code patterns.
In principle I wouldn't be opposed to patches that convert the e820
code to <linux/range.h> types.
>
> > + range_start = entry->addr;
> > + range_end = entry->addr + entry->size;
> >
> > + /* Out of order E820 maps should not happen: */
> > + if (range_start < range_end_prev)
> > + pr_info("BUG: out of order E820 entry!\n");
> > +
> > + if (range_start > range_end_prev) {
> > + pr_info("%s: [gap %#018Lx-%#018Lx]\n",
> > + who,
> > + range_end_prev,
> > + range_start-1);
>
> %pra
This would be part of any <linux/range.h> conversion patches.
> with who mentioned the "gap"?
Not sure I understand?
Thanks,
Ingo
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 07/29] x86/boot/e820: Print out sizes of E820 memory ranges
2025-04-22 6:38 ` Andy Shevchenko
@ 2025-05-15 10:44 ` Ingo Molnar
2025-05-15 10:50 ` Ingo Molnar
2025-05-16 9:15 ` Andy Shevchenko
0 siblings, 2 replies; 68+ messages in thread
From: Ingo Molnar @ 2025-05-15 10:44 UTC (permalink / raw)
To: Andy Shevchenko
Cc: linux-kernel, Andy Shevchenko, Arnd Bergmann, Borislav Petkov,
Juergen Gross, H . Peter Anvin, Kees Cook, Linus Torvalds,
Mike Rapoport, Paul Menzel, Peter Zijlstra, Thomas Gleixner,
David Woodhouse
* Andy Shevchenko <andy.shevchenko@gmail.com> wrote:
> Mon, Apr 21, 2025 at 08:51:47PM +0200, Ingo Molnar kirjoitti:
> > Before:
> >
> > BIOS-provided physical RAM map:
> > BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
> > BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
> > BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
> > BIOS-e820: [mem 0x0000000000100000-0x000000007ffdbfff] usable
> > BIOS-e820: [mem 0x000000007ffdc000-0x000000007fffffff] reserved
> > BIOS-e820: [mem 0x00000000b0000000-0x00000000bfffffff] reserved
> > BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved
> > BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
> > BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
> > BIOS-e820: [mem 0x000000fd00000000-0x000000ffffffffff] reserved
> >
> > After:
> >
> > BIOS-provided physical RAM map:
> > BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] 639 KB kernel usable RAM
> > BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] 1 KB reserved
> > BIOS-e820: [gap 0x00000000000a0000-0x00000000000effff] 320 KB ...
> > BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] 64 KB reserved
> > BIOS-e820: [mem 0x0000000000100000-0x000000007ffdbfff] 1.9 GB kernel usable RAM
> > BIOS-e820: [mem 0x000000007ffdc000-0x000000007fffffff] 144 KB reserved
> > BIOS-e820: [gap 0x0000000080000000-0x00000000afffffff] 768 MB ...
> > BIOS-e820: [mem 0x00000000b0000000-0x00000000bfffffff] 256 MB reserved
> > BIOS-e820: [gap 0x00000000c0000000-0x00000000fed1bfff] 1005.1 MB ...
> > BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] 16 KB reserved
> > BIOS-e820: [gap 0x00000000fed20000-0x00000000feffbfff] 2.8 MB ...
> > BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] 16 KB reserved
> > BIOS-e820: [gap 0x00000000ff000000-0x00000000fffbffff] 15.7 MB ...
> > BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] 256 KB reserved
> > BIOS-e820: [gap 0x0000000100000000-0x000000fcffffffff] 1008 GB ...
> > BIOS-e820: [mem 0x000000fd00000000-0x000000ffffffffff] 12 GB reserved
> >
> > Note how a 1-digit precision field is printed out if a range is
> > fractional in its largest-enclosing natural size unit.
> >
> > So the "256 MB" and "12 GB" fields above denote exactly 256 MB and
> > 12 GB regions, while "1.9 GB" signals the region's fractional nature
> > and it being just below 2GB.
> >
> > Printing E820 maps with such details visualizes 'weird' ranges
> > at a glance, and gives users a better understanding of how
> > large the various ranges are, without having to perform hexadecimal
> > subtraction in their minds.
>
> ...
>
> > +/*
> > + * Print out the size of a E820 region, in human-readable
> > + * fashion, going from KB, MB, GB to TB units.
> > + *
> > + * Print out fractional sizes with a single digit of precision.
> > + */
> > +static void e820_print_size(u64 size)
> > +{
> > + if (size < SZ_1M) {
> > + if (size & (SZ_1K-1))
> > + pr_cont(" %4llu.%01llu KB", size/SZ_1K, 10*(size & (SZ_1K-1))/SZ_1K);
> > + else
> > + pr_cont(" %4llu KB", size/SZ_1K);
>
> I would add some spaces here and there for the sake of readability.
I think it's perfectly readable, skipping the whitespace for numeric
literals is standard style. Linus himself does that occasionally, see:
94a2bc0f611c ("arm64: add 'runtime constant' support")
static inline void __runtime_fixup_ptr(void *where, unsigned long val)
{
__le32 *p = lm_alias(where);
__runtime_fixup_16(p, val);
__runtime_fixup_16(p+1, val >> 16);
__runtime_fixup_16(p+2, val >> 32);
__runtime_fixup_16(p+3, val >> 48);
__runtime_fixup_caches(where, 4);
}
Or:
938df695e98d ("vsprintf: associate the format state with the format pointer")
+ unsigned int shift = 32 - size*8;
which uses visual grouping to make arithmethic expressions more
readable.
>
> > + return;
> > + }
> > + if (size < SZ_1G) {
>
> Can be written in one line as
>
> } else if (...) {
Done. (See delta patch below.)
>
> Ditto for the rest.
>
> > + if (size & (SZ_1M-1))
> > + pr_cont(" %4llu.%01llu MB", size/SZ_1M, 10*(size & (SZ_1M-1))/SZ_1M);
> > + else
> > + pr_cont(" %4llu MB", size/SZ_1M);
> > + return;
> > + }
> > + if (size < SZ_1T) {
> > + if (size & (SZ_1G-1))
> > + pr_cont(" %4llu.%01llu GB", size/SZ_1G, 10*(size & (SZ_1G-1))/SZ_1G);
> > + else
> > + pr_cont(" %4llu GB", size/SZ_1G);
> > + return;
> > + }
> > + if (size & (SZ_1T-1))
> > + pr_cont(" %4llu.%01llu TB", size/SZ_1T, 10*(size & (SZ_1T-1))/SZ_1T);
> > + else
> > + pr_cont(" %4llu TB", size/SZ_1T);
> > +}
>
> Don't you want to use string_helpers.h provided API?
> string_get_size().
I don't think string_get_size() knows the fine distinction between:
BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] 256 KB device reserved
and:
BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] 256.0 KB device reserved
"256 KB" is exactly 256 KB, while "256.0 KB" denotes a value that is a
bit larger than 256 KB but rounds down to 256 KB at 1 KB granularity.
When reading platform boot logs it's useful to know when such values
are exact, at a glance.
Thanks,
Ingo
====================>
arch/x86/kernel/e820.c | 9 +++------
1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 7f600d32a999..67a477203c97 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -213,22 +213,19 @@ static void e820_print_size(u64 size)
else
pr_cont(" %4llu KB", size/SZ_1K);
return;
- }
- if (size < SZ_1G) {
+ } else if (size < SZ_1G) {
if (size & (SZ_1M-1))
pr_cont(" %4llu.%01llu MB", size/SZ_1M, 10*(size & (SZ_1M-1))/SZ_1M);
else
pr_cont(" %4llu MB", size/SZ_1M);
return;
- }
- if (size < SZ_1T) {
+ } else if (size < SZ_1T) {
if (size & (SZ_1G-1))
pr_cont(" %4llu.%01llu GB", size/SZ_1G, 10*(size & (SZ_1G-1))/SZ_1G);
else
pr_cont(" %4llu GB", size/SZ_1G);
return;
- }
- if (size & (SZ_1T-1))
+ } else if (size & (SZ_1T-1))
pr_cont(" %4llu.%01llu TB", size/SZ_1T, 10*(size & (SZ_1T-1))/SZ_1T);
else
pr_cont(" %4llu TB", size/SZ_1T);
^ permalink raw reply related [flat|nested] 68+ messages in thread
* Re: [PATCH 07/29] x86/boot/e820: Print out sizes of E820 memory ranges
2025-05-15 10:44 ` Ingo Molnar
@ 2025-05-15 10:50 ` Ingo Molnar
2025-05-16 9:15 ` Andy Shevchenko
1 sibling, 0 replies; 68+ messages in thread
From: Ingo Molnar @ 2025-05-15 10:50 UTC (permalink / raw)
To: Andy Shevchenko
Cc: linux-kernel, Andy Shevchenko, Arnd Bergmann, Borislav Petkov,
Juergen Gross, H . Peter Anvin, Kees Cook, Linus Torvalds,
Mike Rapoport, Paul Menzel, Peter Zijlstra, Thomas Gleixner,
David Woodhouse
* Ingo Molnar <mingo@kernel.org> wrote:
> > > + }
> > > + if (size < SZ_1G) {
> >
> > Can be written in one line as
> >
> > } else if (...) {
>
> Done. (See delta patch below.)
Actually, I take this back, as there's a return in the branch above:
pr_cont(" %4llu MB", size/SZ_1M);
return;
}
if (size < SZ_1T) {
Which makes the plain 'if' more readable: the previous 'if' branches
off entirely and control flow never gets back, so mix the blocks with
'else if' looks a bit weird.
Thanks,
Ingo
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 07/29] x86/boot/e820: Print out sizes of E820 memory ranges
2025-04-22 6:53 ` Mike Rapoport
@ 2025-05-15 11:00 ` Ingo Molnar
0 siblings, 0 replies; 68+ messages in thread
From: Ingo Molnar @ 2025-05-15 11:00 UTC (permalink / raw)
To: Mike Rapoport
Cc: linux-kernel, Andy Shevchenko, Arnd Bergmann, Borislav Petkov,
Juergen Gross, H . Peter Anvin, Kees Cook, Linus Torvalds,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, David Woodhouse
* Mike Rapoport <rppt@kernel.org> wrote:
> On Mon, Apr 21, 2025 at 08:51:47PM +0200, Ingo Molnar wrote:
> > Before:
> >
> > BIOS-provided physical RAM map:
> > BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
> > BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
> > BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
> > BIOS-e820: [mem 0x0000000000100000-0x000000007ffdbfff] usable
> > BIOS-e820: [mem 0x000000007ffdc000-0x000000007fffffff] reserved
> > BIOS-e820: [mem 0x00000000b0000000-0x00000000bfffffff] reserved
> > BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved
> > BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
> > BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
> > BIOS-e820: [mem 0x000000fd00000000-0x000000ffffffffff] reserved
> >
> > After:
> >
> > BIOS-provided physical RAM map:
> > BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] 639 KB kernel usable RAM
> > BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] 1 KB reserved
> > BIOS-e820: [gap 0x00000000000a0000-0x00000000000effff] 320 KB ...
> > BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] 64 KB reserved
> > BIOS-e820: [mem 0x0000000000100000-0x000000007ffdbfff] 1.9 GB kernel usable RAM
> > BIOS-e820: [mem 0x000000007ffdc000-0x000000007fffffff] 144 KB reserved
> > BIOS-e820: [gap 0x0000000080000000-0x00000000afffffff] 768 MB ...
> > BIOS-e820: [mem 0x00000000b0000000-0x00000000bfffffff] 256 MB reserved
> > BIOS-e820: [gap 0x00000000c0000000-0x00000000fed1bfff] 1005.1 MB ...
> > BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] 16 KB reserved
> > BIOS-e820: [gap 0x00000000fed20000-0x00000000feffbfff] 2.8 MB ...
> > BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] 16 KB reserved
> > BIOS-e820: [gap 0x00000000ff000000-0x00000000fffbffff] 15.7 MB ...
> > BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] 256 KB reserved
> > BIOS-e820: [gap 0x0000000100000000-0x000000fcffffffff] 1008 GB ...
> > BIOS-e820: [mem 0x000000fd00000000-0x000000ffffffffff] 12 GB reserved
> >
> > Note how a 1-digit precision field is printed out if a range is
> > fractional in its largest-enclosing natural size unit.
> >
> > So the "256 MB" and "12 GB" fields above denote exactly 256 MB and
> > 12 GB regions, while "1.9 GB" signals the region's fractional nature
> > and it being just below 2GB.
> >
> > Printing E820 maps with such details visualizes 'weird' ranges
> > at a glance, and gives users a better understanding of how
> > large the various ranges are, without having to perform hexadecimal
> > subtraction in their minds.
> >
> > Signed-off-by: Ingo Molnar <mingo@kernel.org>
> > Cc: Andy Shevchenko <andy@kernel.org>
> > Cc: Arnd Bergmann <arnd@kernel.org>
> > Cc: David Woodhouse <dwmw@amazon.co.uk>
> > Cc: H. Peter Anvin <hpa@zytor.com>
> > Cc: Kees Cook <keescook@chromium.org>
> > Cc: Linus Torvalds <torvalds@linux-foundation.org>
> > Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
> > (cherry picked from commit d1ac6b8718575a7ea2f0a1ff347835a8879df673)
> > ---
> > arch/x86/kernel/e820.c | 47 +++++++++++++++++++++++++++++++++++++++++++++--
> > 1 file changed, 45 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
> > index 10bd10bd5672..8ee89962fcbf 100644
> > --- a/arch/x86/kernel/e820.c
> > +++ b/arch/x86/kernel/e820.c
> > @@ -199,6 +199,41 @@ static void __init e820_print_type(enum e820_type type)
> > }
> > }
> >
> > +/*
> > + * Print out the size of a E820 region, in human-readable
> > + * fashion, going from KB, MB, GB to TB units.
> > + *
> > + * Print out fractional sizes with a single digit of precision.
> > + */
> > +static void e820_print_size(u64 size)
> > +{
> > + if (size < SZ_1M) {
> > + if (size & (SZ_1K-1))
> > + pr_cont(" %4llu.%01llu KB", size/SZ_1K, 10*(size & (SZ_1K-1))/SZ_1K);
> > + else
> > + pr_cont(" %4llu KB", size/SZ_1K);
> > + return;
> > + }
>
> I'd make this a helper, e.g
>
> static void __e820_print_size(u64 size, u64 unit, const char *unit_name)
> {
> if (size & (unit - 1)) {
> u64 fraction = 10 * (size & (unit - 1))/unit;
>
> pr_cont(" %4llu.%01llu %s", size/unit, fraction, unit_name);
> } else {
> pr_cont(" %4llu %s", size/unit, unit_name);
> }
> }
While I like the helper in principle, it doesn't work on 32-bit: it's
an u64 with u64 division, while with the open coded literals the
compiler figures it out.
With div64_u64(), or a macro, it's not nearly as obvious a cleanup
IMHO.
Thanks,
Ingo
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 08/29] x86/boot/e820: Print E820_TYPE_RAM entries as ... RAM entries
2025-04-22 6:43 ` Mike Rapoport
@ 2025-05-15 11:04 ` Ingo Molnar
0 siblings, 0 replies; 68+ messages in thread
From: Ingo Molnar @ 2025-05-15 11:04 UTC (permalink / raw)
To: Mike Rapoport
Cc: Andy Shevchenko, linux-kernel, Andy Shevchenko, Arnd Bergmann,
Borislav Petkov, Juergen Gross, H . Peter Anvin, Kees Cook,
Linus Torvalds, Paul Menzel, Peter Zijlstra, Thomas Gleixner,
David Woodhouse
* Mike Rapoport <rppt@kernel.org> wrote:
> On Tue, Apr 22, 2025 at 09:31:31AM +0300, Andy Shevchenko wrote:
> > Mon, Apr 21, 2025 at 08:51:48PM +0200, Ingo Molnar kirjoitti:
> > > So it is a bit weird that the actual RAM entries of the E820 table
> > > are not actually called RAM, but 'usable':
> > >
> > > BIOS-e820: [mem 0x0000000000100000-0x000000007ffdbfff] 1.9 GB usable
> > >
> > > 'usable' is pretty passive-aggressive in that context and ambiguous,
> > > most E820 entries denote 'usable' address ranges - reserved ranges
> > > may be used by devices, or the platform.
> > >
> > > Clarify and disambiguate this by making the boot log entry
> > > explicitly say 'kernel usable RAM':
> > >
> > > BIOS-e820: [mem 0x0000000000100000-0x000000007ffdbfff] 1.9 GB kernel usable RAM
> >
> > Can't user space use that RAM?
> >
> > Shouldn't we rather refer to "OS usable RAM"?
>
> Or "System RAM", just like in /proc/iomem
Agreed - I have folded in the delta patch below.
Thanks,
Ingo
=============================>
arch/x86/kernel/e820.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 8b84261173cc..0a324d0db60e 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -187,7 +187,7 @@ void __init e820__range_add(u64 start, u64 size, enum e820_type type)
static void __init e820_print_type(enum e820_type type)
{
switch (type) {
- case E820_TYPE_RAM: pr_cont(" kernel usable RAM"); break;
+ case E820_TYPE_RAM: pr_cont(" System RAM"); break;
case E820_TYPE_RESERVED: pr_cont(" reserved"); break;
case E820_TYPE_SOFT_RESERVED: pr_cont(" soft reserved"); break;
case E820_TYPE_ACPI: pr_cont(" ACPI data"); break;
^ permalink raw reply related [flat|nested] 68+ messages in thread
* [PATCH 30/29] x86/boot/e820: Unify e820_print_type() and e820_type_to_string()
2025-04-22 7:06 ` Mike Rapoport
@ 2025-05-15 11:19 ` Ingo Molnar
0 siblings, 0 replies; 68+ messages in thread
From: Ingo Molnar @ 2025-05-15 11:19 UTC (permalink / raw)
To: Mike Rapoport
Cc: linux-kernel, Andy Shevchenko, Arnd Bergmann, Borislav Petkov,
Juergen Gross, H . Peter Anvin, Kees Cook, Linus Torvalds,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, David Woodhouse
* Mike Rapoport <rppt@kernel.org> wrote:
> On Mon, Apr 21, 2025 at 08:51:48PM +0200, Ingo Molnar wrote:
> > So it is a bit weird that the actual RAM entries of the E820 table
> > are not actually called RAM, but 'usable':
> >
> > BIOS-e820: [mem 0x0000000000100000-0x000000007ffdbfff] 1.9 GB usable
> >
> > 'usable' is pretty passive-aggressive in that context and ambiguous,
> > most E820 entries denote 'usable' address ranges - reserved ranges
> > may be used by devices, or the platform.
> >
> > Clarify and disambiguate this by making the boot log entry
> > explicitly say 'kernel usable RAM':
> >
> > BIOS-e820: [mem 0x0000000000100000-0x000000007ffdbfff] 1.9 GB kernel usable RAM
> >
> > Signed-off-by: Ingo Molnar <mingo@kernel.org>
> > Cc: Andy Shevchenko <andy@kernel.org>
> > Cc: Arnd Bergmann <arnd@kernel.org>
> > Cc: David Woodhouse <dwmw@amazon.co.uk>
> > Cc: H. Peter Anvin <hpa@zytor.com>
> > Cc: Kees Cook <keescook@chromium.org>
> > Cc: Linus Torvalds <torvalds@linux-foundation.org>
> > Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
> > ---
> > arch/x86/kernel/e820.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
> > index 8ee89962fcbf..99f997ae88dc 100644
> > --- a/arch/x86/kernel/e820.c
> > +++ b/arch/x86/kernel/e820.c
> > @@ -187,7 +187,7 @@ void __init e820__range_add(u64 start, u64 size, enum e820_type type)
> > static void __init e820_print_type(enum e820_type type)
> > {
> > switch (type) {
> > - case E820_TYPE_RAM: pr_cont(" usable"); break;
> > + case E820_TYPE_RAM: pr_cont(" kernel usable RAM"); break;
> > case E820_TYPE_RESERVED: pr_cont(" reserved"); break;
> > case E820_TYPE_SOFT_RESERVED: pr_cont(" soft reserved"); break;
> > case E820_TYPE_ACPI: pr_cont(" ACPI data"); break;
>
> We have e820_type_to_string(), IMO the whole switch here can be replaced by
>
> pr_cont(" %s", e820_type_to_string(type));
Yeah, agreed, but there's a few additional details:
- Your suggestion doesn't work as-is, because e820_type_to_string()
takes an 'entry' parameter, not 'type'.
- There's some difference in the messages, so I think this should be a
separate patch.
- Also, I think unified messages with the best of both sets of
messages is the best outcome, instead of just picking one side.
See these commits in the WIP.x86/e820 tree:
x86/boot/e820: Change e820_type_to_string() to take a 'type' parameter
x86/boot/e820: Unify e820_print_type() and e820_type_to_string()
With the more interesting one attached below.
Thanks,
Ingo
=================================>
From: Ingo Molnar <mingo@kernel.org>
Date: Thu, 15 May 2025 13:17:45 +0200
Subject: [PATCH] x86/boot/e820: Unify e820_print_type() and e820_type_to_string()
Use e820_type_to_string() to derive e820_print_type(),
and unify the messages:
- Don't Capitalize Words Within Sentences Randomly
- Use 'Device reserved' instead of 'Reserved'
Suggested-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Shevchenko <andy@kernel.org>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
---
arch/x86/kernel/e820.c | 50 ++++++++++++++++++++------------------------------
1 file changed, 20 insertions(+), 30 deletions(-)
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 3a86216ee05f..aadc46f3d074 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -68,6 +68,26 @@ unsigned long pci_mem_start = 0xaeedbabe;
EXPORT_SYMBOL(pci_mem_start);
#endif
+__init static const char * e820_type_to_string(enum e820_type type)
+{
+ switch (type) {
+ case E820_TYPE_RAM: return "System RAM";
+ case E820_TYPE_ACPI: return "ACPI tables";
+ case E820_TYPE_NVS: return "ACPI non-volatile storage";
+ case E820_TYPE_UNUSABLE: return "Unusable memory";
+ case E820_TYPE_PRAM: return "Persistent memory (legacy)";
+ case E820_TYPE_PMEM: return "Persistent memory";
+ case E820_TYPE_RESERVED: return "Device reserved";
+ case E820_TYPE_13: return "Type 13";
+ default: return "Unknown E820 type";
+ }
+}
+
+__init static void e820_print_type(enum e820_type type)
+{
+ pr_cont(" %s", e820_type_to_string(type));
+}
+
/*
* This function checks if any part of the range <start,end> is mapped
* with type.
@@ -186,21 +206,6 @@ __init void e820__range_add(u64 start, u64 size, enum e820_type type)
__e820__range_add(e820_table, start, size, type);
}
-__init static void e820_print_type(enum e820_type type)
-{
- switch (type) {
- case E820_TYPE_RAM: pr_cont(" System RAM"); break;
- case E820_TYPE_RESERVED: pr_cont(" device reserved"); break;
- case E820_TYPE_SOFT_RESERVED: pr_cont(" soft reserved"); break;
- case E820_TYPE_ACPI: pr_cont(" ACPI data"); break;
- case E820_TYPE_NVS: pr_cont(" ACPI NVS"); break;
- case E820_TYPE_UNUSABLE: pr_cont(" unusable"); break;
- case E820_TYPE_PMEM: /* Fall through: */
- case E820_TYPE_PRAM: pr_cont(" persistent RAM (type %u)", type); break;
- default: pr_cont(" type %u", type); break;
- }
-}
-
/*
* Print out the size of a E820 region, in human-readable
* fashion, going from KB, MB, GB to TB units.
@@ -1065,21 +1070,6 @@ __init void e820__finish_early_params(void)
}
}
-__init static const char * e820_type_to_string(enum e820_type type)
-{
- switch (type) {
- case E820_TYPE_RAM: return "System RAM";
- case E820_TYPE_ACPI: return "ACPI Tables";
- case E820_TYPE_NVS: return "ACPI Non-volatile Storage";
- case E820_TYPE_UNUSABLE: return "Unusable memory";
- case E820_TYPE_PRAM: return "Persistent Memory (legacy)";
- case E820_TYPE_PMEM: return "Persistent Memory";
- case E820_TYPE_RESERVED: return "Reserved";
- case E820_TYPE_13: return "Type 13";
- default: return "Unknown E820 type";
- }
-}
-
__init static unsigned long e820_type_to_iomem_type(struct e820_entry *entry)
{
switch (entry->type) {
^ permalink raw reply related [flat|nested] 68+ messages in thread
* Re: [PATCH 10/29] x86/boot/e820: Use 'u64' consistently instead of 'unsigned long long'
2025-04-22 6:44 ` Andy Shevchenko
@ 2025-05-15 11:20 ` Ingo Molnar
0 siblings, 0 replies; 68+ messages in thread
From: Ingo Molnar @ 2025-05-15 11:20 UTC (permalink / raw)
To: Andy Shevchenko
Cc: linux-kernel, Andy Shevchenko, Arnd Bergmann, Borislav Petkov,
Juergen Gross, H . Peter Anvin, Kees Cook, Linus Torvalds,
Mike Rapoport, Paul Menzel, Peter Zijlstra, Thomas Gleixner,
David Woodhouse
* Andy Shevchenko <andy.shevchenko@gmail.com> wrote:
> Mon, Apr 21, 2025 at 08:51:50PM +0200, Ingo Molnar kirjoitti:
> > There's a number of structure fields and local variables related
> > to E820 entry physical addresses that are defined as 'unsigned long long',
> > but then are compared to u64 fields.
> >
> > Make the types all consistently u64.
>
> ...
>
> > + u64 start = e820_table->entries[i].addr;
> > + u64 end = start + e820_table->entries[i].size;
>
> Perhaps struct range as well?
I'm not opposed in principle, but that should be a separate patch or
series.
Thanks,
Ingo
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 14/29] x86/boot/e820: Clean up __e820__range_add() a bit
2025-04-22 10:10 ` Andy Shevchenko
@ 2025-05-15 11:21 ` Ingo Molnar
0 siblings, 0 replies; 68+ messages in thread
From: Ingo Molnar @ 2025-05-15 11:21 UTC (permalink / raw)
To: Andy Shevchenko
Cc: linux-kernel, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, David Woodhouse
* Andy Shevchenko <andy@kernel.org> wrote:
> On Mon, Apr 21, 2025 at 08:51:54PM +0200, Ingo Molnar wrote:
> > - Use 'idx' index variable instead of a weird 'x'
> > - Make the error message E820-specific
> > - Group the code a bit better
>
> ...
>
> > - if (x >= ARRAY_SIZE(table->entries)) {
> > - pr_err("too many entries; ignoring [mem %#010llx-%#010llx]\n",
> > - start, start + size - 1);
> > + if (idx >= ARRAY_SIZE(table->entries)) {
> > + pr_err("too many E820 table entries; ignoring [mem %#010llx-%#010llx]\n",
> > + start, start + size-1);
>
> size - 1 ?
Not in this case, see:
https://lore.kernel.org/r/aCW-qKOYJWRLYgpx@gmail.com
Thanks,
Ingo
^ permalink raw reply [flat|nested] 68+ messages in thread
* [PATCH 31/29] x86/boot/e820: Move index increments outside accessors in e820__update_table()
2025-04-22 10:23 ` Andy Shevchenko
@ 2025-05-15 11:32 ` Ingo Molnar
2025-05-15 11:37 ` [PATCH 17/29] x86/boot/e820: Standardize e820 table index variable names under 'idx' Ingo Molnar
1 sibling, 0 replies; 68+ messages in thread
From: Ingo Molnar @ 2025-05-15 11:32 UTC (permalink / raw)
To: Andy Shevchenko
Cc: linux-kernel, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, David Woodhouse
* Andy Shevchenko <andy@kernel.org> wrote:
> On Mon, Apr 21, 2025 at 08:51:57PM +0200, Ingo Molnar wrote:
>
> ...
>
> > + int idx;
>
> > + u32 idx, chg_idx, chg_nr;
>
> What about sanitizing the type as well to be let's say unsigned int idx in all cases?
Not sure - I kind of like the brevity of 'u32' here, and it's also the
type of the originating ABI interface.
> ...
>
> > + change_point[chg_idx]->addr = entries[idx].addr;
> > + change_point[chg_idx++]->entry = &entries[idx];
> > + change_point[chg_idx]->addr = entries[idx].addr + entries[idx].size;
> > + change_point[chg_idx++]->entry = &entries[idx];
>
> Does GCC 15 not produce any warnings here? Linus recently complain on some code
> with index increment inside the accessor. Perhaps just
>
> change_point[chg_idx]->entry = &entries[idx];
> chg_idx++;
>
> ?
Yeah, good point - patch attached.
Thanks,
Ingo
===================================>
From: Ingo Molnar <mingo@kernel.org>
Date: Thu, 15 May 2025 13:26:46 +0200
Subject: [PATCH] x86/boot/e820: Move index increments outside accessors in e820__update_table()
This kind of code:
change_point[chg_idx++]->entry = &entries[idx];
Can be a bit confusing to human readers, and GCC-15 started
warning about these patterns.
Move the index increment outside the accessor.
Suggested-by: Andy Shevchenko <andy@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
arch/x86/kernel/e820.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index aadc46f3d074..b758c0632d0c 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -421,9 +421,11 @@ __init int e820__update_table(struct e820_table *table)
for (idx = 0; idx < table->nr_entries; idx++) {
if (entries[idx].size != 0) {
change_point[chg_idx]->addr = entries[idx].addr;
- change_point[chg_idx++]->entry = &entries[idx];
+ change_point[chg_idx]->entry = &entries[idx];
+ chg_idx++;
change_point[chg_idx]->addr = entries[idx].addr + entries[idx].size;
- change_point[chg_idx++]->entry = &entries[idx];
+ change_point[chg_idx]->entry = &entries[idx];
+ chg_idx++;
}
}
chg_nr = chg_idx;
^ permalink raw reply related [flat|nested] 68+ messages in thread
* Re: [PATCH 17/29] x86/boot/e820: Standardize e820 table index variable names under 'idx'
2025-04-22 10:23 ` Andy Shevchenko
2025-05-15 11:32 ` [PATCH 31/29] x86/boot/e820: Move index increments outside accessors in e820__update_table() Ingo Molnar
@ 2025-05-15 11:37 ` Ingo Molnar
1 sibling, 0 replies; 68+ messages in thread
From: Ingo Molnar @ 2025-05-15 11:37 UTC (permalink / raw)
To: Andy Shevchenko
Cc: linux-kernel, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, David Woodhouse
* Andy Shevchenko <andy@kernel.org> wrote:
> ...
>
> > + for (idx = 0; idx < overlap_entries; idx++) {
> > + if (overlap_list[idx] == change_point[chg_idx]->entry)
> > + overlap_list[idx] = overlap_list[overlap_entries-1];
>
> overlap_entries - 1 ?
See:
https://lore.kernel.org/r/aCW-qKOYJWRLYgpx@gmail.com
> ...
>
> > + while (--idx >= 0) {
>
> while (idx--) {
>
> should work as well, no?
Yeah, but note that this function gets thoroughly reworked in:
x86/boot/e820: Make sure e820_search_gap() finds all gaps
and the while loop is transformed to a straightforward for loop:
+ for (idx = 0; idx < e820_table->nr_entries; idx++) {
Thanks,
Ingo
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 19/29] x86/boot/e820: Standardize e820 table index variable types under 'u32'
2025-04-22 15:13 ` Andy Shevchenko
@ 2025-05-15 11:39 ` Ingo Molnar
0 siblings, 0 replies; 68+ messages in thread
From: Ingo Molnar @ 2025-05-15 11:39 UTC (permalink / raw)
To: Andy Shevchenko
Cc: linux-kernel, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, David Woodhouse
* Andy Shevchenko <andy@kernel.org> wrote:
> On Mon, Apr 21, 2025 at 08:51:59PM +0200, Ingo Molnar wrote:
> > So we have 'idx' types of 'int' and 'unsigned int', and sometimes
> > we assign 'u32' fields such as e820_table::nr_entries to these 'int'
> > values.
> >
> > While there's no real risk of overflow with these tables, make it
> > all cleaner by standardizing on a single type: u32.
> >
> > This also happens to shrink the code a bit:
> >
> > text data bss dec hex filename
> > 7745 44072 0 51817 ca69 e820.o.before
> > 7613 44072 0 51685 c9e5 e820.o.after
>
> Ah, here it is! You can ignore my respective comment in one of the previous
> patches. Perhaps better to group that one (which converts to use idx) and this
> one, so they will be sequential in the series?
Certainly, and done, the order is now:
x86/boot/e820: Standardize e820 table index variable names under 'idx'
x86/boot/e820: Standardize e820 table index variable types under 'u32'
x86/boot/e820: Change struct e820_table::nr_entries type from __u32 to u32
Thanks,
Ingo
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 20/29] x86/boot/e820: Clean up e820__setup_pci_gap()/e820_search_gap() a bit
2025-04-22 16:37 ` Andy Shevchenko
@ 2025-05-15 11:44 ` Ingo Molnar
0 siblings, 0 replies; 68+ messages in thread
From: Ingo Molnar @ 2025-05-15 11:44 UTC (permalink / raw)
To: Andy Shevchenko
Cc: linux-kernel, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, David Woodhouse
* Andy Shevchenko <andy@kernel.org> wrote:
> On Mon, Apr 21, 2025 at 08:52:00PM +0200, Ingo Molnar wrote:
> > Use a bit more readable variable names, we haven't run out of
> > underscore characters in the kernel yet.
>
> Size notes:
>
> > + gap_size = 0x400000;
>
> SZ_4M ?
>
> ...
>
> > + gap_start = (max_pfn << PAGE_SHIFT) + 1024*1024;
>
> SZ_1M ?
Good point - delta patch below folded back & propagated to the rest of
the series.
Thanks,
Ingo
===============================>
arch/x86/kernel/e820.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 3baf558107e7..d00ca2dd20b7 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -722,12 +722,12 @@ __init void e820__setup_pci_gap(void)
unsigned long gap_start, gap_size;
int found;
- gap_size = 0x400000;
+ gap_size = SZ_4M;
found = e820_search_gap(&gap_start, &gap_size);
if (!found) {
#ifdef CONFIG_X86_64
- gap_start = (max_pfn << PAGE_SHIFT) + 1024*1024;
+ gap_start = (max_pfn << PAGE_SHIFT) + SZ_1M;
pr_err("Cannot find an available gap in the 32-bit address range\n");
pr_err("PCI devices with unassigned 32-bit BARs may not work!\n");
#else
^ permalink raw reply related [flat|nested] 68+ messages in thread
* Re: [PATCH 27/29] x86/boot/e820: Simplify the e820__range_remove() API
2025-04-22 16:41 ` Andy Shevchenko
@ 2025-05-15 11:49 ` Ingo Molnar
0 siblings, 0 replies; 68+ messages in thread
From: Ingo Molnar @ 2025-05-15 11:49 UTC (permalink / raw)
To: Andy Shevchenko
Cc: linux-kernel, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, David Woodhouse
* Andy Shevchenko <andy@kernel.org> wrote:
> On Mon, Apr 21, 2025 at 08:52:07PM +0200, Ingo Molnar wrote:
> > Right now e820__range_remove() has two parameters to control the
> > E820 type of the range removed:
> >
> > extern void e820__range_remove(u64 start, u64 size, enum e820_type old_type, bool check_type);
> >
> > Since E820 types start at 1, zero has a natural meaning of 'no type.
> >
> > Consolidate the (old_type,check_type) parameters into a single (filter_type)
> > parameter:
> >
> > extern void e820__range_remove(u64 start, u64 size, enum e820_type filter_type);
> >
> > Note that both e820__mapped_raw_any() and e820__mapped_any()
> > already have such semantics for their 'type' parameter, although
> > it's currently not used with '0' by in-kernel code.
> >
> > Also, the __e820__mapped_all() internal helper already has such
> > semantics implemented as well, and the e820__get_entry_type() API
> > uses the '0' type to such effect.
> >
> > This simplifies not just e820__range_remove(), and synchronizes its
> > use of type filters with other E820 API functions, but simplifies
> > usage sites as well, such as parse_memmap_one(), beyond the reduction
> > of the number of parameters:
> >
> > - else if (from)
> > - e820__range_remove(start_at, mem_size, from, 1);
> > else
> > - e820__range_remove(start_at, mem_size, 0, 0);
> > + e820__range_remove(start_at, mem_size, from);
> >
> > The generated code gets smaller as well:
> >
> > add/remove: 0/0 grow/shrink: 0/5 up/down: 0/-66 (-66)
> >
> > Function old new delta
> > parse_memopt 112 107 -5
> > efi_init 1048 1039 -9
> > setup_arch 2719 2709 -10
> > e820__range_remove 283 273 -10
> > parse_memmap_opt 559 527 -32
> >
> > Total: Before=22,675,600, After=22,675,534, chg -0.00%
>
> > extern void e820__range_add (u64 start, u64 size, enum e820_type type);
> > extern u64 e820__range_update(u64 start, u64 size, enum e820_type old_type, enum e820_type new_type);
> > -extern void e820__range_remove(u64 start, u64 size, enum e820_type old_type, bool check_type);
> > +extern void e820__range_remove(u64 start, u64 size, enum e820_type filter_type);
> > extern u64 e820__range_update_table(struct e820_table *t, u64 start, u64 size, enum e820_type old_type, enum e820_type new_type);
> >
> > extern int e820__update_table(struct e820_table *table);
>
> Wondering if are going to get rid of 'extern' for the functions...
Symmetrical use of storage classes provide useful documentation:
- I kinda like the immediate visual reminder of 'extern' that these
are exported API functions and not a function definition or
something else.
- Just like 'static void ...' is an immediate visual reminder that the
following function definition is local scope, or 'static inline' in
a header is an immediate reminder that it's an inline API.
We use such symmetric taggint in other places: we don't write
'unsigned' instead of 'unsigned int', just because we can.
But no strong feelings either way, as long as it's consistent within
the subsystem. The wider kernel is certainly using both approaches.
Thanks,
Ingo
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 00/29] x86/boot/e820: Assorted E820 table handling features and cleanups
2025-04-22 6:58 ` [PATCH 00/29] x86/boot/e820: Assorted E820 table handling features and cleanups Arnd Bergmann
@ 2025-05-15 11:56 ` Ingo Molnar
0 siblings, 0 replies; 68+ messages in thread
From: Ingo Molnar @ 2025-05-15 11:56 UTC (permalink / raw)
To: Arnd Bergmann
Cc: linux-kernel, Andy Shevchenko, Borislav Petkov, Juergen Gross,
H. Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner
* Arnd Bergmann <arnd@kernel.org> wrote:
> On Mon, Apr 21, 2025, at 20:51, Ingo Molnar wrote:
> >
> > - Assorted cleanups: type cleanups, simplifications, standardization
> > of coding patterns, etc.
>
> Since you are already looking at cleaning up the types and testing
> a lot, I wonder if you could make sure this also works for a 32-bit
> phys_addr_t when booting a 32-bit kernel with and without
> CONFIG_X86_PAE. In my recent cleanup series I originally
> changed phys_addr_t to 32 bit after removing CONFIG_HIGHMEM_64G,
> but this caused regressions, so it's still left as u64 even
> though it should not be needed any more.
Yeah, this series does boot fine with and without PAE+HIGHMEM4G in my
testing.
Thanks,
Ingo
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 05/29] x86/boot/e820: Print gaps in the E820 table
2025-05-15 10:28 ` Ingo Molnar
@ 2025-05-16 9:13 ` Andy Shevchenko
0 siblings, 0 replies; 68+ messages in thread
From: Andy Shevchenko @ 2025-05-16 9:13 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, David Woodhouse
On Thu, May 15, 2025 at 12:28:13PM +0200, Ingo Molnar wrote:
> * Andy Shevchenko <andy.shevchenko@gmail.com> wrote:
> > Mon, Apr 21, 2025 at 08:51:45PM +0200, Ingo Molnar kirjoitti:
...
> > > + u64 range_start, range_end;
> >
> > struct range (from range.h) and...
>
> Yeah, using those primitives makes sense, but right now the e820 code
> isn't using them, and it's better to have similar & unified range
> handling code patterns.
>
> In principle I wouldn't be opposed to patches that convert the e820
> code to <linux/range.h> types.
Okay, perhaps a separate cleanup in the future. Not sure if I will have time,
but let's see...
...
> > > + if (range_start > range_end_prev) {
> > > + pr_info("%s: [gap %#018Lx-%#018Lx]\n",
> > > + who,
> > > + range_end_prev,
> > > + range_start-1);
> >
> > %pra
>
> This would be part of any <linux/range.h> conversion patches.
>
> > with who mentioned the "gap"?
>
> Not sure I understand?
With the range.h in place and the mentioned specifier, the above will be like
BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
BIOS-e820: [range 0x00000000000a0000-0x00000000000effff] gap
BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
--
With Best Regards,
Andy Shevchenko
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH 07/29] x86/boot/e820: Print out sizes of E820 memory ranges
2025-05-15 10:44 ` Ingo Molnar
2025-05-15 10:50 ` Ingo Molnar
@ 2025-05-16 9:15 ` Andy Shevchenko
1 sibling, 0 replies; 68+ messages in thread
From: Andy Shevchenko @ 2025-05-16 9:15 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Arnd Bergmann, Borislav Petkov, Juergen Gross,
H . Peter Anvin, Kees Cook, Linus Torvalds, Mike Rapoport,
Paul Menzel, Peter Zijlstra, Thomas Gleixner, David Woodhouse
On Thu, May 15, 2025 at 12:44:06PM +0200, Ingo Molnar wrote:
> * Andy Shevchenko <andy.shevchenko@gmail.com> wrote:
> > Mon, Apr 21, 2025 at 08:51:47PM +0200, Ingo Molnar kirjoitti:
...
> > > + if (size & (SZ_1T-1))
> > > + pr_cont(" %4llu.%01llu TB", size/SZ_1T, 10*(size & (SZ_1T-1))/SZ_1T);
> > > + else
> > > + pr_cont(" %4llu TB", size/SZ_1T);
> > > +}
> >
> > Don't you want to use string_helpers.h provided API?
> > string_get_size().
>
> I don't think string_get_size() knows the fine distinction between:
>
> BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] 256 KB device reserved
>
> and:
>
> BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] 256.0 KB device reserved
>
> "256 KB" is exactly 256 KB, while "256.0 KB" denotes a value that is a
> bit larger than 256 KB but rounds down to 256 KB at 1 KB granularity.
>
> When reading platform boot logs it's useful to know when such values
> are exact, at a glance.
We can patch string_size() to print precise integers without fractional part.
--
With Best Regards,
Andy Shevchenko
^ permalink raw reply [flat|nested] 68+ messages in thread
end of thread, other threads:[~2025-05-16 9:15 UTC | newest]
Thread overview: 68+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-21 18:51 [PATCH 00/29] x86/boot/e820: Assorted E820 table handling features and cleanups Ingo Molnar
2025-04-21 18:51 ` [PATCH 01/29] x86/boot/e820: Remove inverted boolean logic from the e820_nomerge() function name, rename it to e820_type_mergeable() Ingo Molnar
2025-04-21 18:51 ` [PATCH 02/29] x86/boot/e820: Simplify e820__print_table() a bit Ingo Molnar
2025-04-22 5:56 ` Mike Rapoport
2025-05-15 10:15 ` Ingo Molnar
2025-04-21 18:51 ` [PATCH 03/29] x86/boot/e820: Simplify the PPro Erratum #50 workaround Ingo Molnar
2025-04-22 6:29 ` Andy Shevchenko
2025-05-15 10:20 ` Ingo Molnar
2025-04-21 18:51 ` [PATCH 04/29] x86/boot/e820: Mark e820__print_table() static Ingo Molnar
2025-04-21 18:51 ` [PATCH 05/29] x86/boot/e820: Print gaps in the E820 table Ingo Molnar
2025-04-22 6:00 ` Mike Rapoport
2025-04-22 6:26 ` Andy Shevchenko
2025-05-15 10:23 ` Ingo Molnar
2025-04-22 6:42 ` Andy Shevchenko
2025-05-15 10:28 ` Ingo Molnar
2025-05-16 9:13 ` Andy Shevchenko
2025-04-21 18:51 ` [PATCH 06/29] x86/boot/e820: Make the field separator space character part of e820_print_type() Ingo Molnar
2025-04-21 18:51 ` [PATCH 07/29] x86/boot/e820: Print out sizes of E820 memory ranges Ingo Molnar
2025-04-22 6:38 ` Andy Shevchenko
2025-05-15 10:44 ` Ingo Molnar
2025-05-15 10:50 ` Ingo Molnar
2025-05-16 9:15 ` Andy Shevchenko
2025-04-22 6:53 ` Mike Rapoport
2025-05-15 11:00 ` Ingo Molnar
2025-04-21 18:51 ` [PATCH 08/29] x86/boot/e820: Print E820_TYPE_RAM entries as ... RAM entries Ingo Molnar
2025-04-22 6:31 ` Andy Shevchenko
2025-04-22 6:43 ` Mike Rapoport
2025-05-15 11:04 ` Ingo Molnar
2025-04-22 7:06 ` Mike Rapoport
2025-05-15 11:19 ` [PATCH 30/29] x86/boot/e820: Unify e820_print_type() and e820_type_to_string() Ingo Molnar
2025-04-21 18:51 ` [PATCH 09/29] x86/boot/e820: Call the PCI gap a 'gap' in the boot log printout Ingo Molnar
2025-04-21 18:51 ` [PATCH 10/29] x86/boot/e820: Use 'u64' consistently instead of 'unsigned long long' Ingo Molnar
2025-04-22 6:44 ` Andy Shevchenko
2025-05-15 11:20 ` Ingo Molnar
2025-04-21 18:51 ` [PATCH 11/29] x86/boot/e820: Remove pointless early_panic() indirection Ingo Molnar
2025-04-21 18:51 ` [PATCH 12/29] x86/boot/e820: Clean up confusing and self-contradictory verbiage around E820 related resource allocations Ingo Molnar
2025-04-21 18:51 ` [PATCH 13/29] x86/boot/e820: Improve e820_print_type() messages Ingo Molnar
2025-04-21 18:51 ` [PATCH 14/29] x86/boot/e820: Clean up __e820__range_add() a bit Ingo Molnar
2025-04-22 10:10 ` Andy Shevchenko
2025-05-15 11:21 ` Ingo Molnar
2025-04-21 18:51 ` [PATCH 15/29] x86/boot/e820: Clean up __refdata use " Ingo Molnar
2025-04-21 18:51 ` [PATCH 16/29] x86/boot/e820: Remove unnecessary header inclusions Ingo Molnar
2025-04-21 18:51 ` [PATCH 17/29] x86/boot/e820: Standardize e820 table index variable names under 'idx' Ingo Molnar
2025-04-22 10:23 ` Andy Shevchenko
2025-05-15 11:32 ` [PATCH 31/29] x86/boot/e820: Move index increments outside accessors in e820__update_table() Ingo Molnar
2025-05-15 11:37 ` [PATCH 17/29] x86/boot/e820: Standardize e820 table index variable names under 'idx' Ingo Molnar
2025-04-21 18:51 ` [PATCH 18/29] x86/boot/e820: Change struct e820_table::nr_entries type from __u32 to u32 Ingo Molnar
2025-04-21 18:51 ` [PATCH 19/29] x86/boot/e820: Standardize e820 table index variable types under 'u32' Ingo Molnar
2025-04-22 15:13 ` Andy Shevchenko
2025-05-15 11:39 ` Ingo Molnar
2025-04-21 18:52 ` [PATCH 20/29] x86/boot/e820: Clean up e820__setup_pci_gap()/e820_search_gap() a bit Ingo Molnar
2025-04-22 16:37 ` Andy Shevchenko
2025-05-15 11:44 ` Ingo Molnar
2025-04-21 18:52 ` [PATCH 21/29] x86/boot/e820: Change e820_search_gap() to search for the highest-address PCI gap Ingo Molnar
2025-04-21 18:52 ` [PATCH 22/29] x86/boot/e820: Rename gap_start/gap_size to max_gap_start/max_gap_start in e820_search_gap() et al Ingo Molnar
2025-04-21 18:52 ` [PATCH 23/29] x86/boot/e820: Simplify & clarify __e820__range_add() a bit Ingo Molnar
2025-04-21 18:52 ` [PATCH 24/29] x86/boot/e820: Standardize __init/__initdata tag placement Ingo Molnar
2025-04-21 18:52 ` [PATCH 25/29] x86/boot/e820: Simplify append_e820_table() and remove restriction on single-entry tables Ingo Molnar
2025-04-21 18:52 ` [PATCH 26/29] x86/boot/e820: Remove e820__range_remove()'s unused return parameter Ingo Molnar
2025-04-21 18:52 ` [PATCH 27/29] x86/boot/e820: Simplify the e820__range_remove() API Ingo Molnar
2025-04-22 16:41 ` Andy Shevchenko
2025-05-15 11:49 ` Ingo Molnar
2025-04-21 18:52 ` [PATCH 28/29] x86/boot/e820: Make sure e820_search_gap() finds all gaps Ingo Molnar
2025-04-21 18:52 ` [PATCH 29/29] x86/boot/e820: Treat non-type-2 'reserved' E820 region types as E820_TYPE_RESERVED Ingo Molnar
2025-04-25 3:55 ` H. Peter Anvin
2025-05-15 10:04 ` [PATCH -v2 29/29] x86/boot/e820: Introduce E820_TYPE_13 and treat it as a device region Ingo Molnar
2025-04-22 6:58 ` [PATCH 00/29] x86/boot/e820: Assorted E820 table handling features and cleanups Arnd Bergmann
2025-05-15 11:56 ` Ingo Molnar
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).