* [PATCH 0/4] x86/mm: some cleanups for pagetable setup code
From: Brendan Jackman @ 2025-10-03 16:56 UTC
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
H. Peter Anvin, Andy Lutomirski, Peter Zijlstra
Cc: linux-kernel, Brendan Jackman
Per discussion in [0] I'm looking for ways to refactor this code to make
ASI easier to deal with. But, while looking, I found some little things
that seem like just straightforward cleanups without any real
refactoring needed. So let's start there.
The last two patches are closely related and could potentially be
squashed, but I've split the change in two because the first half is
trivial while the second is trickier and more likely to be wrong.
This applies to tip/master.
[0] https://lore.kernel.org/all/20250924-b4-asi-page-alloc-v1-0-2d861768041f@google.com/T/#t
Signed-off-by: Brendan Jackman <jackmanb@google.com>
---
Brendan Jackman (4):
x86/mm: delete disabled debug code
x86/mm: harmonize return value of phys_pte_init()
x86/mm: drop unused return from pgtable setup functions
x86/mm: simplify calculation of max_pfn_mapped
arch/x86/include/asm/pgtable.h | 3 +-
arch/x86/mm/init.c | 19 ++++----
arch/x86/mm/init_32.c | 5 +--
arch/x86/mm/init_64.c | 99 ++++++++++++++----------------------------
arch/x86/mm/mm_internal.h | 11 ++---
5 files changed, 48 insertions(+), 89 deletions(-)
---
base-commit: 47870f1fa057a411519108f0833dd2603179234f
change-id: 20251003-x86-init-cleanup-0ad754910bac
Best regards,
--
Brendan Jackman <jackmanb@google.com>
* [PATCH 1/4] x86/mm: delete disabled debug code
From: Brendan Jackman @ 2025-10-03 16:56 UTC
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
H. Peter Anvin, Andy Lutomirski, Peter Zijlstra
Cc: linux-kernel, Brendan Jackman
This code doesn't run. Since 2008 the kernel has gained more flexible
logging and tracing capabilities; presumably, if anyone had wanted to
take advantage of this log message, they would have got rid of the
"if (0)" so they could use those capabilities.
Since they haven't, just delete it.
Signed-off-by: Brendan Jackman <jackmanb@google.com>
---
arch/x86/mm/init_64.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index b9426fce5f3e3f17df57df7b12338f3c0ef4c288..9e45b371a6234b41bd7177b81b5d432341ae7214 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -504,9 +504,6 @@ phys_pte_init(pte_t *pte_page, unsigned long paddr, unsigned long paddr_end,
continue;
}
- if (0)
- pr_info(" pte=%p addr=%lx pte=%016lx\n", pte, paddr,
- pfn_pte(paddr >> PAGE_SHIFT, PAGE_KERNEL).pte);
pages++;
set_pte_init(pte, pfn_pte(paddr >> PAGE_SHIFT, prot), init);
paddr_last = (paddr & PAGE_MASK) + PAGE_SIZE;
--
2.50.1
* [PATCH 2/4] x86/mm: harmonize return value of phys_pte_init()
From: Brendan Jackman @ 2025-10-03 16:56 UTC
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
H. Peter Anvin, Andy Lutomirski, Peter Zijlstra
Cc: linux-kernel, Brendan Jackman
In the case that they encounter pre-existing mappings, all the other
phys_*_init()s include those pre-mapped PFNs in the returned value.
Excluding those PFNs only when they are mapped at 4K seems like an
error. So make it consistent.
The other functions only include the existing mappings if the
page_size_mask would have allowed creating those mappings.
4K pages can't be disabled by page_size_mask so that condition is not
needed here; paddr_last can be assigned unconditionally before checking
for existing mappings.
Signed-off-by: Brendan Jackman <jackmanb@google.com>
---
arch/x86/mm/init_64.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 9e45b371a6234b41bd7177b81b5d432341ae7214..968a5092dbd7ee3e7007fa0c769eff7d7ecb0ba3 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -492,6 +492,8 @@ phys_pte_init(pte_t *pte_page, unsigned long paddr, unsigned long paddr_end,
continue;
}
+ paddr_last = paddr_next;
+
/*
* We will re-use the existing mapping.
* Xen for example has some special requirements, like mapping
@@ -506,7 +508,6 @@ phys_pte_init(pte_t *pte_page, unsigned long paddr, unsigned long paddr_end,
pages++;
set_pte_init(pte, pfn_pte(paddr >> PAGE_SHIFT, prot), init);
- paddr_last = (paddr & PAGE_MASK) + PAGE_SIZE;
}
update_page_count(PG_LEVEL_4K, pages);
--
2.50.1
* [PATCH 3/4] x86/mm: drop unused return from pgtable setup functions
From: Brendan Jackman @ 2025-10-03 16:56 UTC
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
H. Peter Anvin, Andy Lutomirski, Peter Zijlstra
Cc: linux-kernel, Brendan Jackman
These functions return the last physical address that they mapped, but
none of their callers look at this value. Drop it.
Signed-off-by: Brendan Jackman <jackmanb@google.com>
---
arch/x86/include/asm/pgtable.h | 3 +--
arch/x86/mm/init.c | 16 +++++++---------
arch/x86/mm/init_64.c | 7 +++----
arch/x86/mm/mm_internal.h | 5 ++---
4 files changed, 13 insertions(+), 18 deletions(-)
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index e33df3da698043aaa275f3f875bbf97ea8db5703..6fd789831b40dd7881a038589f5f898629b8c239 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1177,8 +1177,7 @@ extern int direct_gbpages;
void init_mem_mapping(void);
void early_alloc_pgt_buf(void);
void __init poking_init(void);
-unsigned long init_memory_mapping(unsigned long start,
- unsigned long end, pgprot_t prot);
+void init_memory_mapping(unsigned long start, unsigned long end, pgprot_t prot);
#ifdef CONFIG_X86_64
extern pgd_t trampoline_pgd_entry;
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index bb57e93b4caf16e4ceb4797bb6d5ecd2b38de7e6..d97e8407989c536078ee4419bbb94c21bc6abf4c 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -531,11 +531,11 @@ bool pfn_range_is_mapped(unsigned long start_pfn, unsigned long end_pfn)
* This runs before bootmem is initialized and gets pages directly from
* the physical memory. To access them they are temporarily mapped.
*/
-unsigned long __ref init_memory_mapping(unsigned long start,
- unsigned long end, pgprot_t prot)
+void __ref init_memory_mapping(unsigned long start,
+ unsigned long end, pgprot_t prot)
{
struct map_range mr[NR_RANGE_MR];
- unsigned long ret = 0;
+ unsigned long paddr_last = 0;
int nr_range, i;
pr_debug("init_memory_mapping: [mem %#010lx-%#010lx]\n",
@@ -545,13 +545,11 @@ unsigned long __ref init_memory_mapping(unsigned long start,
nr_range = split_mem_range(mr, 0, start, end);
for (i = 0; i < nr_range; i++)
- ret = kernel_physical_mapping_init(mr[i].start, mr[i].end,
- mr[i].page_size_mask,
- prot);
+ paddr_last = kernel_physical_mapping_init(mr[i].start, mr[i].end,
+ mr[i].page_size_mask,
+ prot);
- add_pfn_range_mapped(start >> PAGE_SHIFT, ret >> PAGE_SHIFT);
-
- return ret >> PAGE_SHIFT;
+ add_pfn_range_mapped(start >> PAGE_SHIFT, paddr_last >> PAGE_SHIFT);
}
/*
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 968a5092dbd7ee3e7007fa0c769eff7d7ecb0ba3..7462f813052ccd45f0199b98bd0ad6499a164f6f 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -810,14 +810,13 @@ kernel_physical_mapping_init(unsigned long paddr_start,
* when updating the mapping. The caller is responsible to flush the TLBs after
* the function returns.
*/
-unsigned long __meminit
+void __meminit
kernel_physical_mapping_change(unsigned long paddr_start,
unsigned long paddr_end,
unsigned long page_size_mask)
{
- return __kernel_physical_mapping_init(paddr_start, paddr_end,
- page_size_mask, PAGE_KERNEL,
- false);
+ __kernel_physical_mapping_init(paddr_start, paddr_end,
+ page_size_mask, PAGE_KERNEL, false);
}
#ifndef CONFIG_NUMA
diff --git a/arch/x86/mm/mm_internal.h b/arch/x86/mm/mm_internal.h
index 097aadc250f7442986cde998b17bab5bada85e3e..436396936dfbe5d48b46872628d25de317ae6ced 100644
--- a/arch/x86/mm/mm_internal.h
+++ b/arch/x86/mm/mm_internal.h
@@ -14,9 +14,8 @@ unsigned long kernel_physical_mapping_init(unsigned long start,
unsigned long end,
unsigned long page_size_mask,
pgprot_t prot);
-unsigned long kernel_physical_mapping_change(unsigned long start,
- unsigned long end,
- unsigned long page_size_mask);
+void kernel_physical_mapping_change(unsigned long start, unsigned long end,
+ unsigned long page_size_mask);
void zone_sizes_init(void);
extern int after_bootmem;
--
2.50.1
* [PATCH 4/4] x86/mm: simplify calculation of max_pfn_mapped
From: Brendan Jackman @ 2025-10-03 16:56 UTC
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
H. Peter Anvin, Andy Lutomirski, Peter Zijlstra
Cc: linux-kernel, Brendan Jackman
The phys_*_init()s return the "last physical address mapped". The exact
definition of this is pretty fiddly, but only when there is a mismatch
between the alignment of the requested range and the page sizes allowed
by page_size_mask, or when the range ends in a region that is not mapped
according to e820.
The only user that looks at the ultimate return value of this logic is
init_memory_mapping(), which never hits those conditions: it calls
kernel_physical_mapping_init() only for ranges that exist, with
page_size_mask set according to the alignment of their edges. In that
case the return value is just paddr_end, which the caller already has,
so it can be dropped.
Signed-off-by: Brendan Jackman <jackmanb@google.com>
---
arch/x86/mm/init.c | 11 +++---
arch/x86/mm/init_32.c | 5 +--
arch/x86/mm/init_64.c | 90 ++++++++++++++++-------------------------------
arch/x86/mm/mm_internal.h | 6 ++--
4 files changed, 39 insertions(+), 73 deletions(-)
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index d97e8407989c536078ee4419bbb94c21bc6abf4c..eb91f35410eec3b8298d04d867094d80a970387c 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -544,12 +544,13 @@ void __ref init_memory_mapping(unsigned long start,
memset(mr, 0, sizeof(mr));
nr_range = split_mem_range(mr, 0, start, end);
- for (i = 0; i < nr_range; i++)
- paddr_last = kernel_physical_mapping_init(mr[i].start, mr[i].end,
- mr[i].page_size_mask,
- prot);
+ for (i = 0; i < nr_range; i++) {
+ kernel_physical_mapping_init(mr[i].start, mr[i].end,
+ mr[i].page_size_mask, prot);
+ paddr_last = mr[i].end;
+ }
- add_pfn_range_mapped(start >> PAGE_SHIFT, paddr_last >> PAGE_SHIFT);
+ add_pfn_range_mapped(start >> PAGE_SHIFT, paddr_last);
}
/*
diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c
index 8a34fff6ab2b19f083f4fdf706de3ca0867416ba..b197736d90892b200002e4665e82f22125fa4bab 100644
--- a/arch/x86/mm/init_32.c
+++ b/arch/x86/mm/init_32.c
@@ -245,14 +245,13 @@ static inline int is_x86_32_kernel_text(unsigned long addr)
* of max_low_pfn pages, by creating page tables starting from address
* PAGE_OFFSET:
*/
-unsigned long __init
+void __init
kernel_physical_mapping_init(unsigned long start,
unsigned long end,
unsigned long page_size_mask,
pgprot_t prot)
{
int use_pse = page_size_mask == (1<<PG_LEVEL_2M);
- unsigned long last_map_addr = end;
unsigned long start_pfn, end_pfn;
pgd_t *pgd_base = swapper_pg_dir;
int pgd_idx, pmd_idx, pte_ofs;
@@ -356,7 +355,6 @@ kernel_physical_mapping_init(unsigned long start,
pages_4k++;
if (mapping_iter == 1) {
set_pte(pte, pfn_pte(pfn, init_prot));
- last_map_addr = (pfn << PAGE_SHIFT) + PAGE_SIZE;
} else
set_pte(pte, pfn_pte(pfn, prot));
}
@@ -382,7 +380,6 @@ kernel_physical_mapping_init(unsigned long start,
mapping_iter = 2;
goto repeat;
}
- return last_map_addr;
}
#ifdef CONFIG_HIGHMEM
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 7462f813052ccd45f0199b98bd0ad6499a164f6f..60f1a7493844ea399dd08dca50126f22a50d63d7 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -464,16 +464,12 @@ void __init cleanup_highmap(void)
}
}
-/*
- * Create PTE level page table mapping for physical addresses.
- * It returns the last physical address mapped.
- */
-static unsigned long __meminit
+/* Create PTE level page table mapping for physical addresses. */
+static void __meminit
phys_pte_init(pte_t *pte_page, unsigned long paddr, unsigned long paddr_end,
pgprot_t prot, bool init)
{
unsigned long pages = 0, paddr_next;
- unsigned long paddr_last = paddr_end;
pte_t *pte;
int i;
@@ -492,8 +488,6 @@ phys_pte_init(pte_t *pte_page, unsigned long paddr, unsigned long paddr_end,
continue;
}
- paddr_last = paddr_next;
-
/*
* We will re-use the existing mapping.
* Xen for example has some special requirements, like mapping
@@ -511,21 +505,17 @@ phys_pte_init(pte_t *pte_page, unsigned long paddr, unsigned long paddr_end,
}
update_page_count(PG_LEVEL_4K, pages);
-
- return paddr_last;
}
/*
* Create PMD level page table mapping for physical addresses. The virtual
* and physical address have to be aligned at this level.
- * It returns the last physical address mapped.
*/
-static unsigned long __meminit
+static void __meminit
phys_pmd_init(pmd_t *pmd_page, unsigned long paddr, unsigned long paddr_end,
unsigned long page_size_mask, pgprot_t prot, bool init)
{
unsigned long pages = 0, paddr_next;
- unsigned long paddr_last = paddr_end;
int i = pmd_index(paddr);
@@ -549,9 +539,7 @@ phys_pmd_init(pmd_t *pmd_page, unsigned long paddr, unsigned long paddr_end,
if (!pmd_leaf(*pmd)) {
spin_lock(&init_mm.page_table_lock);
pte = (pte_t *)pmd_page_vaddr(*pmd);
- paddr_last = phys_pte_init(pte, paddr,
- paddr_end, prot,
- init);
+ phys_pte_init(pte, paddr, paddr_end, prot, init);
spin_unlock(&init_mm.page_table_lock);
continue;
}
@@ -570,7 +558,6 @@ phys_pmd_init(pmd_t *pmd_page, unsigned long paddr, unsigned long paddr_end,
if (page_size_mask & (1 << PG_LEVEL_2M)) {
if (!after_bootmem)
pages++;
- paddr_last = paddr_next;
continue;
}
new_prot = pte_pgprot(pte_clrhuge(*(pte_t *)pmd));
@@ -583,33 +570,29 @@ phys_pmd_init(pmd_t *pmd_page, unsigned long paddr, unsigned long paddr_end,
pfn_pmd(paddr >> PAGE_SHIFT, prot_sethuge(prot)),
init);
spin_unlock(&init_mm.page_table_lock);
- paddr_last = paddr_next;
continue;
}
pte = alloc_low_page();
- paddr_last = phys_pte_init(pte, paddr, paddr_end, new_prot, init);
+ phys_pte_init(pte, paddr, paddr_end, new_prot, init);
spin_lock(&init_mm.page_table_lock);
pmd_populate_kernel_init(&init_mm, pmd, pte, init);
spin_unlock(&init_mm.page_table_lock);
}
update_page_count(PG_LEVEL_2M, pages);
- return paddr_last;
}
/*
* Create PUD level page table mapping for physical addresses. The virtual
* and physical address do not have to be aligned at this level. KASLR can
* randomize virtual addresses up to this level.
- * It returns the last physical address mapped.
*/
-static unsigned long __meminit
+static void __meminit
phys_pud_init(pud_t *pud_page, unsigned long paddr, unsigned long paddr_end,
unsigned long page_size_mask, pgprot_t _prot, bool init)
{
unsigned long pages = 0, paddr_next;
- unsigned long paddr_last = paddr_end;
unsigned long vaddr = (unsigned long)__va(paddr);
int i = pud_index(vaddr);
@@ -635,10 +618,8 @@ phys_pud_init(pud_t *pud_page, unsigned long paddr, unsigned long paddr_end,
if (!pud_none(*pud)) {
if (!pud_leaf(*pud)) {
pmd = pmd_offset(pud, 0);
- paddr_last = phys_pmd_init(pmd, paddr,
- paddr_end,
- page_size_mask,
- prot, init);
+ phys_pmd_init(pmd, paddr, paddr_end,
+ page_size_mask, prot, init);
continue;
}
/*
@@ -656,7 +637,6 @@ phys_pud_init(pud_t *pud_page, unsigned long paddr, unsigned long paddr_end,
if (page_size_mask & (1 << PG_LEVEL_1G)) {
if (!after_bootmem)
pages++;
- paddr_last = paddr_next;
continue;
}
prot = pte_pgprot(pte_clrhuge(*(pte_t *)pud));
@@ -669,13 +649,11 @@ phys_pud_init(pud_t *pud_page, unsigned long paddr, unsigned long paddr_end,
pfn_pud(paddr >> PAGE_SHIFT, prot_sethuge(prot)),
init);
spin_unlock(&init_mm.page_table_lock);
- paddr_last = paddr_next;
continue;
}
pmd = alloc_low_page();
- paddr_last = phys_pmd_init(pmd, paddr, paddr_end,
- page_size_mask, prot, init);
+ phys_pmd_init(pmd, paddr, paddr_end, page_size_mask, prot, init);
spin_lock(&init_mm.page_table_lock);
pud_populate_init(&init_mm, pud, pmd, init);
@@ -683,23 +661,22 @@ phys_pud_init(pud_t *pud_page, unsigned long paddr, unsigned long paddr_end,
}
update_page_count(PG_LEVEL_1G, pages);
-
- return paddr_last;
}
-static unsigned long __meminit
+static void __meminit
phys_p4d_init(p4d_t *p4d_page, unsigned long paddr, unsigned long paddr_end,
unsigned long page_size_mask, pgprot_t prot, bool init)
{
- unsigned long vaddr, vaddr_end, vaddr_next, paddr_next, paddr_last;
+ unsigned long vaddr, vaddr_end, vaddr_next, paddr_next;
- paddr_last = paddr_end;
vaddr = (unsigned long)__va(paddr);
vaddr_end = (unsigned long)__va(paddr_end);
- if (!pgtable_l5_enabled())
- return phys_pud_init((pud_t *) p4d_page, paddr, paddr_end,
- page_size_mask, prot, init);
+ if (!pgtable_l5_enabled()) {
+ phys_pud_init((pud_t *) p4d_page, paddr, paddr_end,
+ page_size_mask, prot, init);
+ return;
+ }
for (; vaddr < vaddr_end; vaddr = vaddr_next) {
p4d_t *p4d = p4d_page + p4d_index(vaddr);
@@ -721,33 +698,30 @@ phys_p4d_init(p4d_t *p4d_page, unsigned long paddr, unsigned long paddr_end,
if (!p4d_none(*p4d)) {
pud = pud_offset(p4d, 0);
- paddr_last = phys_pud_init(pud, paddr, __pa(vaddr_end),
- page_size_mask, prot, init);
+ phys_pud_init(pud, paddr, __pa(vaddr_end),
+ page_size_mask, prot, init);
continue;
}
pud = alloc_low_page();
- paddr_last = phys_pud_init(pud, paddr, __pa(vaddr_end),
- page_size_mask, prot, init);
+ phys_pud_init(pud, paddr, __pa(vaddr_end),
+ page_size_mask, prot, init);
spin_lock(&init_mm.page_table_lock);
p4d_populate_init(&init_mm, p4d, pud, init);
spin_unlock(&init_mm.page_table_lock);
}
-
- return paddr_last;
}
-static unsigned long __meminit
+static void __meminit
__kernel_physical_mapping_init(unsigned long paddr_start,
unsigned long paddr_end,
unsigned long page_size_mask,
pgprot_t prot, bool init)
{
bool pgd_changed = false;
- unsigned long vaddr, vaddr_start, vaddr_end, vaddr_next, paddr_last;
+ unsigned long vaddr, vaddr_start, vaddr_end, vaddr_next;
- paddr_last = paddr_end;
vaddr = (unsigned long)__va(paddr_start);
vaddr_end = (unsigned long)__va(paddr_end);
vaddr_start = vaddr;
@@ -760,16 +734,14 @@ __kernel_physical_mapping_init(unsigned long paddr_start,
if (pgd_val(*pgd)) {
p4d = (p4d_t *)pgd_page_vaddr(*pgd);
- paddr_last = phys_p4d_init(p4d, __pa(vaddr),
- __pa(vaddr_end),
- page_size_mask,
- prot, init);
+ phys_p4d_init(p4d, __pa(vaddr), __pa(vaddr_end),
+ page_size_mask, prot, init);
continue;
}
p4d = alloc_low_page();
- paddr_last = phys_p4d_init(p4d, __pa(vaddr), __pa(vaddr_end),
- page_size_mask, prot, init);
+ phys_p4d_init(p4d, __pa(vaddr), __pa(vaddr_end),
+ page_size_mask, prot, init);
spin_lock(&init_mm.page_table_lock);
if (pgtable_l5_enabled())
@@ -784,8 +756,6 @@ __kernel_physical_mapping_init(unsigned long paddr_start,
if (pgd_changed)
sync_global_pgds(vaddr_start, vaddr_end - 1);
-
- return paddr_last;
}
@@ -793,15 +763,15 @@ __kernel_physical_mapping_init(unsigned long paddr_start,
* Create page table mapping for the physical memory for specific physical
* addresses. Note that it can only be used to populate non-present entries.
* The virtual and physical addresses have to be aligned on PMD level
- * down. It returns the last physical address mapped.
+ * down.
*/
-unsigned long __meminit
+void __meminit
kernel_physical_mapping_init(unsigned long paddr_start,
unsigned long paddr_end,
unsigned long page_size_mask, pgprot_t prot)
{
- return __kernel_physical_mapping_init(paddr_start, paddr_end,
- page_size_mask, prot, true);
+ __kernel_physical_mapping_init(paddr_start, paddr_end,
+ page_size_mask, prot, true);
}
/*
diff --git a/arch/x86/mm/mm_internal.h b/arch/x86/mm/mm_internal.h
index 436396936dfbe5d48b46872628d25de317ae6ced..0fa6bbcb5ad21af6f1e4240eeb486f2f310ed39c 100644
--- a/arch/x86/mm/mm_internal.h
+++ b/arch/x86/mm/mm_internal.h
@@ -10,10 +10,8 @@ static inline void *alloc_low_page(void)
void early_ioremap_page_table_range_init(void);
-unsigned long kernel_physical_mapping_init(unsigned long start,
- unsigned long end,
- unsigned long page_size_mask,
- pgprot_t prot);
+void kernel_physical_mapping_init(unsigned long start, unsigned long end,
+ unsigned long page_size_mask, pgprot_t prot);
void kernel_physical_mapping_change(unsigned long start, unsigned long end,
unsigned long page_size_mask);
void zone_sizes_init(void);
--
2.50.1
* Re: [PATCH 0/4] x86/mm: some cleanups for pagetable setup code
From: Brendan Jackman @ 2025-10-21 13:06 UTC
To: Brendan Jackman, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Andy Lutomirski, Peter Zijlstra
Cc: linux-kernel
On Fri Oct 3, 2025 at 4:56 PM UTC, Brendan Jackman wrote:
> Per discussion in [0] I'm looking for ways to refactor this code to make
> ASI easier to deal with. But, while looking, I found some little things
> that seem like just straightforward cleanups without any real
> refactoring needed. So let's start there.
>
> The last two patches are closely related and could potentially be
> squashed, but I've split the change in two because the first half is
> trivial while the second is trickier and more likely to be wrong.
>
> This applies to tip/master.
>
> [0] https://lore.kernel.org/all/20250924-b4-asi-page-alloc-v1-0-2d861768041f@google.com/T/#t
>
> Signed-off-by: Brendan Jackman <jackmanb@google.com>
> ---
> Brendan Jackman (4):
> x86/mm: delete disabled debug code
> x86/mm: harmonize return value of phys_pte_init()
> x86/mm: drop unused return from pgtable setup functions
> x86/mm: simplify calculation of max_pfn_mapped
>
> arch/x86/include/asm/pgtable.h | 3 +-
> arch/x86/mm/init.c | 19 ++++----
> arch/x86/mm/init_32.c | 5 +--
> arch/x86/mm/init_64.c | 99 ++++++++++++++----------------------------
> arch/x86/mm/mm_internal.h | 11 ++---
> 5 files changed, 48 insertions(+), 89 deletions(-)
> ---
> base-commit: 47870f1fa057a411519108f0833dd2603179234f
> change-id: 20251003-x86-init-cleanup-0ad754910bac
>
> Best regards,
Hey, is anyone able to take a look at this?
* [tip: x86/cleanups] x86/mm: Delete disabled debug code
From: tip-bot2 for Brendan Jackman @ 2025-11-27 13:39 UTC
To: linux-tip-commits
Cc: Brendan Jackman, Borislav Petkov (AMD), x86, linux-kernel
The following commit has been merged into the x86/cleanups branch of tip:
Commit-ID: 3d1f1088455d9a9bce51f0c1e6a81f518a5cb468
Gitweb: https://git.kernel.org/tip/3d1f1088455d9a9bce51f0c1e6a81f518a5cb468
Author: Brendan Jackman <jackmanb@google.com>
AuthorDate: Fri, 03 Oct 2025 16:56:41
Committer: Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Thu, 27 Nov 2025 14:32:16 +01:00
x86/mm: Delete disabled debug code
This code doesn't run. Since 2008:
4f9c11dd49fb ("x86, 64-bit: adjust mapping of physical pagetables to work with Xen")
the kernel has gained more flexible logging and tracing capabilities;
presumably if anyone wanted to take advantage of this log message they would
have got rid of the "if (0)" so they could use these capabilities.
Since they haven't, just delete it.
Signed-off-by: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://patch.msgid.link/20251003-x86-init-cleanup-v1-1-f2b7994c2ad6@google.com
---
arch/x86/mm/init_64.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 0e4270e..1044aaf 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -504,9 +504,6 @@ phys_pte_init(pte_t *pte_page, unsigned long paddr, unsigned long paddr_end,
continue;
}
- if (0)
- pr_info(" pte=%p addr=%lx pte=%016lx\n", pte, paddr,
- pfn_pte(paddr >> PAGE_SHIFT, PAGE_KERNEL).pte);
pages++;
set_pte_init(pte, pfn_pte(paddr >> PAGE_SHIFT, prot), init);
paddr_last = (paddr & PAGE_MASK) + PAGE_SIZE;
* Re: [PATCH 2/4] x86/mm: harmonize return value of phys_pte_init()
From: Borislav Petkov @ 2025-11-27 14:35 UTC
To: Brendan Jackman
Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, x86, H. Peter Anvin,
Andy Lutomirski, Peter Zijlstra, linux-kernel
On Fri, Oct 03, 2025 at 04:56:42PM +0000, Brendan Jackman wrote:
> In the case that they encounter pre-existing mappings, all the other
> phys_*_init()s include those pre-mapped PFNs in the returned value.
> Excluding those PFNs only when they are mapped at 4K seems like an
> error. So make it consistent.
>
> The other functions only include the existing mappings if the
> page_size_mask would have allowed creating those mappings.
> 4K pages can't be disabled by page_size_mask so that condition is not
> needed here; paddr_last can be assigned unconditionally before checking
> for existing mappings.
>
> Signed-off-by: Brendan Jackman <jackmanb@google.com>
> ---
> arch/x86/mm/init_64.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 9e45b371a6234b41bd7177b81b5d432341ae7214..968a5092dbd7ee3e7007fa0c769eff7d7ecb0ba3 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -492,6 +492,8 @@ phys_pte_init(pte_t *pte_page, unsigned long paddr, unsigned long paddr_end,
> continue;
> }
>
> + paddr_last = paddr_next;
> +
> /*
> * We will re-use the existing mapping.
> * Xen for example has some special requirements, like mapping
I don't understand: the other phys_*_init() things do:
if (!XXX_none())
...
paddr_last = paddr_next;
while you've raised the assignment above that test.
Also "seems like an error" needs a lot more poking at because if it is an
error, then its incarnation must be really nasty and subtle or it is not, and
then we don't care. And it has been that way for a while now...
But maybe I'm not seeing it from the right angle...
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
* Re: [PATCH 2/4] x86/mm: harmonize return value of phys_pte_init()
From: Brendan Jackman @ 2025-11-28 14:03 UTC
To: Borislav Petkov, Brendan Jackman
Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, x86, H. Peter Anvin,
Andy Lutomirski, Peter Zijlstra, linux-kernel
On Thu Nov 27, 2025 at 2:35 PM UTC, Borislav Petkov wrote:
> On Fri, Oct 03, 2025 at 04:56:42PM +0000, Brendan Jackman wrote:
>> In the case that they encounter pre-existing mappings, all the other
>> phys_*_init()s include those pre-mapped PFNs in the returned value.
>> Excluding those PFNs only when they are mapped at 4K seems like an
>> error. So make it consistent.
>>
>> The other functions only include the existing mappings if the
>> page_size_mask would have allowed creating those mappings.
>> 4K pages can't be disabled by page_size_mask so that condition is not
>> needed here; paddr_last can be assigned unconditionally before checking
>> for existing mappings.
>>
>> Signed-off-by: Brendan Jackman <jackmanb@google.com>
>> ---
>> arch/x86/mm/init_64.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
>> index 9e45b371a6234b41bd7177b81b5d432341ae7214..968a5092dbd7ee3e7007fa0c769eff7d7ecb0ba3 100644
>> --- a/arch/x86/mm/init_64.c
>> +++ b/arch/x86/mm/init_64.c
>> @@ -492,6 +492,8 @@ phys_pte_init(pte_t *pte_page, unsigned long paddr, unsigned long paddr_end,
>> continue;
>> }
>>
>> + paddr_last = paddr_next;
>> +
>> /*
>> * We will re-use the existing mapping.
>> * Xen for example has some special requirements, like mapping
>
> I don't understand: the other phys_*_init() things do:
>
> if (!XXX_none())
>
> ...
>
> paddr_last = paddr_next;
>
> while you've raised the assignment above that test.
Well they actually do this:
if (!p*_none()) {
        if (!p*_leaf()) {
                paddr_last = ...
                continue;
        }
        if (page_size_mask & *) {
                paddr_last = ...
                continue;
        }
}
if (page_size_mask & *) {
        paddr_last = *
        continue;
}
paddr_last = *
That is, they update paddr_last unconditionally, whereas before this
patch phys_pte_init() skips the update in the !pte_none() case.
> Also "seems like an error" needs a lot more poking at because if it is an
> error, then its incarnation must be really nasty and subtle or it is not, and
> then we don't care. And it has been that way for a while now...
Before the patchset, the return value of kernel_physical_mapping_init()
means something like:
1. The last physical address that was mapped.
2. ... This includes addresses that were already mapped before the call
3. ... UNLESS that pre-existing mapping was 4K.
In patch 4/4 I'm claiming:
> The exact definition of this is pretty fiddly, but only when there is a mismatch
> between the alignment of the requested range and the page sizes allowed
> by page_size_mask, or when the range ends in a region that is not mapped
> according to e820.
Which would not be true given point 3 above. Without this
phys_pte_init() change, the return value of init_memory_mapping() is
fiddly even if you allow arbitrary page sizes and all the paddrs
you're trying to map definitely exist, because of the 4K special-case in
point 3. Instead of trying to justify why init_memory_mapping() doesn't
care even about that special-case, I just removed that special-case
because I think it was probably a bug anyway.
HOWEVER... with the wisdom of hindsight... this was a VERY obscure
and confusing way to go about writing the patchset. I apologise!
I think the right way to do this is to drop this patch (2/4) and
evaluate the remainder against the claim that init_memory_mapping()
doesn't care about the return value at all. So that would have to mean:
a. It only calls kernel_physical_mapping_init() for physical ranges that
exist.
b. It always uses a page_size_mask that matches the alignment of the
ranges it's passing.
c. It doesn't operate on ranges that already have mappings.
Am I making a bit more sense now...?
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/4] x86/mm: harmonize return value of phys_pte_init()
2025-11-28 14:03 ` Brendan Jackman
@ 2025-12-05 19:29 ` Dave Hansen
2025-12-07 2:39 ` Brendan Jackman
0 siblings, 1 reply; 11+ messages in thread
From: Dave Hansen @ 2025-12-05 19:29 UTC (permalink / raw)
To: Brendan Jackman, Borislav Petkov
Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, x86, H. Peter Anvin,
Andy Lutomirski, Peter Zijlstra, linux-kernel
On 11/28/25 06:03, Brendan Jackman wrote:
> Before the patchset, the return value of kernel_physical_mapping_init()
> means something like:
>
> 1. The last physical address that was mapped.
>
> 2. ... This includes addresses that were already mapped before the call
>
> 3. ... UNLESS that pre-existing mapping was 4K.
Yeah, the 4k thing certainly sounds like a bug. The *only* thing that
this influences is the add_pfn_range_mapped() call and it doesn't care
about 4k.
> I think the right way to do this is to drop this patch (2/4) and
> evaluate the remainder against the claim that init_memory_mapping()
> doesn't care about the return value at all. So that would have to mean:
>
> a. It only calls kernel_physical_mapping_init() for physical ranges that
> exist.
>
> b. It always uses a page_size_mask that matches the alignment of the
> ranges it's passing.
>
> c. It doesn't operate on ranges that already have mappings.
Yeah, that makes sense to go forward with. Instead of having the code
try to cope with all that stuff that we don't think is happening
_anyway_, let's just warn on those conditions and effectively not handle
them.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 2/4] x86/mm: harmonize return value of phys_pte_init()
2025-12-05 19:29 ` Dave Hansen
@ 2025-12-07 2:39 ` Brendan Jackman
0 siblings, 0 replies; 11+ messages in thread
From: Brendan Jackman @ 2025-12-07 2:39 UTC (permalink / raw)
To: Dave Hansen, Brendan Jackman, Borislav Petkov
Cc: Thomas Gleixner, Ingo Molnar, Dave Hansen, x86, H. Peter Anvin,
Andy Lutomirski, Peter Zijlstra, linux-kernel
On Fri Dec 5, 2025 at 7:29 PM UTC, Dave Hansen wrote:
> On 11/28/25 06:03, Brendan Jackman wrote:
>> Before the patchset, the return value of kernel_physical_mapping_init()
>> means something like:
>>
>> 1. The last physical address that was mapped.
>>
>> 2. ... This includes addresses that were already mapped before the call
>>
>> 3. ... UNLESS that pre-existing mapping was 4K.
>
> Yeah, the 4k thing certainly sounds like a bug. The *only* thing that
> this influences is the add_pfn_range_mapped() call and it doesn't care
> about 4k.
>
>> I think the right way to do this is to drop this patch (2/4) and
>> evaluate the remainder against the claim that init_memory_mapping()
>> doesn't care about the return value at all. So that would have to mean:
>>
>> a. It only calls kernel_physical_mapping_init() for physical ranges that
>> exist.
>>
>> b. It always uses a page_size_mask that matches the alignment of the
>> ranges it's passing.
>>
>> c. It doesn't operate on ranges that already have mappings.
>
> Yeah, that makes sense to go forward with. Instead of having the code
> try to cope with all that stuff that we don't think is happening
> _anyway_, let's just warn on those conditions and effectively not handle
> them.
I assume those conditions can arise in other cases than
init_memory_mapping(). It's just that those cases already ignore the
return value so it doesn't matter anyway.
Anyway yeah will go ahead with this approach, minus the warnings.
Probably after LPC as I am still not finished with my page_alloc
stuff (yikes!).
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2025-12-07 2:39 UTC | newest]
Thread overview: 11+ messages
2025-10-03 16:56 [PATCH 0/4] x86/mm: some cleanups for pagetable setup code Brendan Jackman
2025-10-03 16:56 ` [PATCH 1/4] x86/mm: delete disabled debug code Brendan Jackman
2025-11-27 13:39 ` [tip: x86/cleanups] x86/mm: Delete " tip-bot2 for Brendan Jackman
2025-10-03 16:56 ` [PATCH 2/4] x86/mm: harmonize return value of phys_pte_init() Brendan Jackman
2025-11-27 14:35 ` Borislav Petkov
2025-11-28 14:03 ` Brendan Jackman
2025-12-05 19:29 ` Dave Hansen
2025-12-07 2:39 ` Brendan Jackman
2025-10-03 16:56 ` [PATCH 3/4] x86/mm: drop unused return from pgtable setup functions Brendan Jackman
2025-10-03 16:56 ` [PATCH 4/4] x86/mm: simplify calculation of max_pfn_mapped Brendan Jackman
2025-10-21 13:06 ` [PATCH 0/4] x86/mm: some cleanups for pagetable setup code Brendan Jackman