* [PATCH 6.6 0/3] arm64: Speed up boot with faster linear map creation
@ 2026-02-17 13:34 Ryan Roberts
2026-02-17 13:34 ` [PATCH 6.6 1/3] arm64: mm: Don't remap pgtables per-cont(pte|pmd) block Ryan Roberts
` (3 more replies)
0 siblings, 4 replies; 13+ messages in thread
From: Ryan Roberts @ 2026-02-17 13:34 UTC (permalink / raw)
To: stable
Cc: Ryan Roberts, catalin.marinas, will, linux-arm-kernel,
linux-kernel, Jack Aboutboul, Sharath George John, Noah Meyerhans,
Jim Perrin
Hi All,
This series is a backport that applies to stable kernel 6.6 (base v6.6.126), for
some speed ups to enable significantly faster booting on systems with a lot of
memory. The patches were originally posted at:
https://lore.kernel.org/linux-arm-kernel/20240412131908.433043-1-ryan.roberts@arm.com/
... and were originally merged upstream in v6.10-rc1.
I'm requesting this be merged to stable on behalf of a partner who wants to get
the benefit of this series in Debian 12.
Thanks,
Ryan
Ryan Roberts (3):
arm64: mm: Don't remap pgtables per-cont(pte|pmd) block
arm64: mm: Batch dsb and isb when populating pgtables
arm64: mm: Don't remap pgtables for allocate vs populate
arch/arm64/include/asm/pgtable.h | 7 ++-
arch/arm64/mm/mmu.c | 92 ++++++++++++++++++--------------
2 files changed, 57 insertions(+), 42 deletions(-)
--
2.43.0
* [PATCH 6.6 1/3] arm64: mm: Don't remap pgtables per-cont(pte|pmd) block
2026-02-17 13:34 [PATCH 6.6 0/3] arm64: Speed up boot with faster linear map creation Ryan Roberts
@ 2026-02-17 13:34 ` Ryan Roberts
2026-02-17 13:34 ` [PATCH 6.6 2/3] arm64: mm: Batch dsb and isb when populating pgtables Ryan Roberts
` (2 subsequent siblings)
3 siblings, 0 replies; 13+ messages in thread
From: Ryan Roberts @ 2026-02-17 13:34 UTC (permalink / raw)
To: stable
Cc: Ryan Roberts, catalin.marinas, will, linux-arm-kernel,
linux-kernel, Jack Aboutboul, Sharath George John, Noah Meyerhans,
Jim Perrin, Itaru Kitayama, Eric Chanudet, Mark Rutland,
Ard Biesheuvel
[ Upstream commit 5c63db59c5f89925add57642be4f789d0d671ccd ]
A large part of the kernel boot time is creating the kernel linear map
page tables. When rodata=full, all memory is mapped by pte. And when
there is lots of physical ram, there are lots of pte tables to populate.
The primary cost associated with this is mapping and unmapping the pte
table memory in the fixmap; at unmap time, the TLB entry must be
invalidated and this is expensive.
Previously, each pmd and pte table was fixmapped/fixunmapped for each
cont(pte|pmd) block of mappings (16 entries with 4K granule). This means
we ended up issuing 32 TLBIs per (pmd|pte) table during the population
phase.
Let's fix that, and fixmap/fixunmap each page once per population, for a
saving of 31 TLBIs per (pmd|pte) table. This gives a significant boot
speedup.
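As a rough illustration of the arithmetic above (not kernel code), the
sketch below models how many fixmap/fixunmap cycles, and hence TLBIs, a
single fully populated table incurs before and after this change. It
assumes a 4K granule with 512 entries per table and 16 entries per
contiguous block; all names are hypothetical.

```c
#include <assert.h>

/*
 * Illustrative model only: these names are hypothetical and do not exist
 * in the kernel. Assumes a 4K granule: 512 entries per table and 16
 * entries per cont(pte|pmd) block.
 */
enum {
	MODEL_ENTRIES_PER_TABLE = 512,
	MODEL_CONT_ENTRIES = 16,
};

/* Old scheme: one fixmap/fixunmap (and so one TLBI) per cont block. */
static unsigned int model_tlbis_per_table_old(unsigned int entries,
					      unsigned int cont_entries)
{
	assert(cont_entries != 0 && entries % cont_entries == 0);
	return entries / cont_entries;
}

/* New scheme: the table is fixmapped once for the whole population. */
static unsigned int model_tlbis_per_table_new(void)
{
	return 1;
}
```

With these constants the old scheme gives 512 / 16 = 32 TLBIs per table
and the new scheme gives 1, which matches the saving of 31 quoted above.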
Execution time of map_mem(), which creates the kernel linear map page
tables, was measured on different machines with different RAM configs:
               | Apple M2 VM | Ampere Altra| Ampere Altra| Ampere Altra
               | VM, 16G     | VM, 64G     | VM, 256G    | Metal, 512G
---------------|-------------|-------------|-------------|-------------
               | ms      (%) | ms      (%) | ms      (%) | ms      (%)
---------------|-------------|-------------|-------------|-------------
before         |  168   (0%) | 2198   (0%) | 8644   (0%) | 17447  (0%)
after          |   78 (-53%) |  435 (-80%) | 1723 (-80%) |  3779 (-78%)
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Tested-by: Itaru Kitayama <itaru.kitayama@fujitsu.com>
Tested-by: Eric Chanudet <echanude@redhat.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240412131908.433043-2-ryan.roberts@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
[ Ryan: Trivial backport ]
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
arch/arm64/mm/mmu.c | 27 ++++++++++++++-------------
1 file changed, 14 insertions(+), 13 deletions(-)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index c8e83fe1cd5a7..130e915a3845e 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -169,12 +169,9 @@ bool pgattr_change_is_safe(u64 old, u64 new)
return ((old ^ new) & ~mask) == 0;
}
-static void init_pte(pmd_t *pmdp, unsigned long addr, unsigned long end,
+static void init_pte(pte_t *ptep, unsigned long addr, unsigned long end,
phys_addr_t phys, pgprot_t prot)
{
- pte_t *ptep;
-
- ptep = pte_set_fixmap_offset(pmdp, addr);
do {
pte_t old_pte = READ_ONCE(*ptep);
@@ -189,8 +186,6 @@ static void init_pte(pmd_t *pmdp, unsigned long addr, unsigned long end,
phys += PAGE_SIZE;
} while (ptep++, addr += PAGE_SIZE, addr != end);
-
- pte_clear_fixmap();
}
static void alloc_init_cont_pte(pmd_t *pmdp, unsigned long addr,
@@ -201,6 +196,7 @@ static void alloc_init_cont_pte(pmd_t *pmdp, unsigned long addr,
{
unsigned long next;
pmd_t pmd = READ_ONCE(*pmdp);
+ pte_t *ptep;
BUG_ON(pmd_sect(pmd));
if (pmd_none(pmd)) {
@@ -216,6 +212,7 @@ static void alloc_init_cont_pte(pmd_t *pmdp, unsigned long addr,
}
BUG_ON(pmd_bad(pmd));
+ ptep = pte_set_fixmap_offset(pmdp, addr);
do {
pgprot_t __prot = prot;
@@ -226,20 +223,21 @@ static void alloc_init_cont_pte(pmd_t *pmdp, unsigned long addr,
(flags & NO_CONT_MAPPINGS) == 0)
__prot = __pgprot(pgprot_val(prot) | PTE_CONT);
- init_pte(pmdp, addr, next, phys, __prot);
+ init_pte(ptep, addr, next, phys, __prot);
+ ptep += pte_index(next) - pte_index(addr);
phys += next - addr;
} while (addr = next, addr != end);
+
+ pte_clear_fixmap();
}
-static void init_pmd(pud_t *pudp, unsigned long addr, unsigned long end,
+static void init_pmd(pmd_t *pmdp, unsigned long addr, unsigned long end,
phys_addr_t phys, pgprot_t prot,
phys_addr_t (*pgtable_alloc)(int), int flags)
{
unsigned long next;
- pmd_t *pmdp;
- pmdp = pmd_set_fixmap_offset(pudp, addr);
do {
pmd_t old_pmd = READ_ONCE(*pmdp);
@@ -265,8 +263,6 @@ static void init_pmd(pud_t *pudp, unsigned long addr, unsigned long end,
}
phys += next - addr;
} while (pmdp++, addr = next, addr != end);
-
- pmd_clear_fixmap();
}
static void alloc_init_cont_pmd(pud_t *pudp, unsigned long addr,
@@ -276,6 +272,7 @@ static void alloc_init_cont_pmd(pud_t *pudp, unsigned long addr,
{
unsigned long next;
pud_t pud = READ_ONCE(*pudp);
+ pmd_t *pmdp;
/*
* Check for initial section mappings in the pgd/pud.
@@ -294,6 +291,7 @@ static void alloc_init_cont_pmd(pud_t *pudp, unsigned long addr,
}
BUG_ON(pud_bad(pud));
+ pmdp = pmd_set_fixmap_offset(pudp, addr);
do {
pgprot_t __prot = prot;
@@ -304,10 +302,13 @@ static void alloc_init_cont_pmd(pud_t *pudp, unsigned long addr,
(flags & NO_CONT_MAPPINGS) == 0)
__prot = __pgprot(pgprot_val(prot) | PTE_CONT);
- init_pmd(pudp, addr, next, phys, __prot, pgtable_alloc, flags);
+ init_pmd(pmdp, addr, next, phys, __prot, pgtable_alloc, flags);
+ pmdp += pmd_index(next) - pmd_index(addr);
phys += next - addr;
} while (addr = next, addr != end);
+
+ pmd_clear_fixmap();
}
static void alloc_init_pud(pgd_t *pgdp, unsigned long addr, unsigned long end,
--
2.43.0
* [PATCH 6.6 2/3] arm64: mm: Batch dsb and isb when populating pgtables
2026-02-17 13:34 [PATCH 6.6 0/3] arm64: Speed up boot with faster linear map creation Ryan Roberts
2026-02-17 13:34 ` [PATCH 6.6 1/3] arm64: mm: Don't remap pgtables per-cont(pte|pmd) block Ryan Roberts
@ 2026-02-17 13:34 ` Ryan Roberts
2026-02-17 13:34 ` [PATCH 6.6 3/3] arm64: mm: Don't remap pgtables for allocate vs populate Ryan Roberts
2026-02-17 13:50 ` [PATCH 6.6 0/3] arm64: Speed up boot with faster linear map creation Greg KH
3 siblings, 0 replies; 13+ messages in thread
From: Ryan Roberts @ 2026-02-17 13:34 UTC (permalink / raw)
To: stable
Cc: Ryan Roberts, catalin.marinas, will, linux-arm-kernel,
linux-kernel, Jack Aboutboul, Sharath George John, Noah Meyerhans,
Jim Perrin, Itaru Kitayama, Eric Chanudet, Mark Rutland,
Ard Biesheuvel
[ Upstream commit 1fcb7cea8a5f7747e02230f816c2c80b060d9517 ]
After removing unnecessary TLBIs, the next bottleneck when creating the
page tables for the linear map is DSB and ISB, which were previously
issued per-pte in __set_pte(). Since we are writing multiple ptes in a
given pte table, we can elide these barriers and insert them once we
have finished writing to the table.
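The batching idea can be sketched with a small userspace model. This is
only an assumption-laden illustration: model_barrier() is a counter that
stands in for the dsb + isb pair, and none of these helpers are kernel
functions. In the real set_pte() the barriers are additionally
conditional on the new entry being a valid kernel pte, which the model
ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these are kernel functions. */
static unsigned long model_barriers;

/* Stand-in for the dsb + isb pair; it only counts invocations. */
static void model_barrier(void)
{
	model_barriers++;
}

/* Plain store with no barrier, analogous in spirit to set_pte_nosync(). */
static void model_set_entry_nosync(uint64_t *entryp, uint64_t val)
{
	*entryp = val;
}

/* Per-entry barriers: one barrier after every store. */
static unsigned long model_populate_synced(uint64_t *table, size_t n)
{
	assert(table != NULL);
	model_barriers = 0;
	for (size_t i = 0; i < n; i++) {
		model_set_entry_nosync(&table[i], (uint64_t)i);
		model_barrier();
	}
	return model_barriers;
}

/* Batched: write every entry, then issue a single barrier at the end. */
static unsigned long model_populate_batched(uint64_t *table, size_t n)
{
	assert(table != NULL);
	model_barriers = 0;
	for (size_t i = 0; i < n; i++)
		model_set_entry_nosync(&table[i], (uint64_t)i);
	model_barrier();
	return model_barriers;
}
```

For a 512-entry table the first variant counts 512 barriers and the
second counts 1, while both leave the table with identical contents.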
Execution time of map_mem(), which creates the kernel linear map page
tables, was measured on different machines with different RAM configs:
               | Apple M2 VM | Ampere Altra| Ampere Altra| Ampere Altra
               | VM, 16G     | VM, 64G     | VM, 256G    | Metal, 512G
---------------|-------------|-------------|-------------|-------------
               | ms      (%) | ms      (%) | ms      (%) | ms      (%)
---------------|-------------|-------------|-------------|-------------
before         |   78   (0%) |  435   (0%) | 1723   (0%) |  3779  (0%)
after          |   11 (-86%) |  161 (-63%) |  656 (-62%) |  1654 (-56%)
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Tested-by: Itaru Kitayama <itaru.kitayama@fujitsu.com>
Tested-by: Eric Chanudet <echanude@redhat.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240412131908.433043-3-ryan.roberts@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
[ Ryan: Trivial backport ]
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
arch/arm64/include/asm/pgtable.h | 7 ++++++-
arch/arm64/mm/mmu.c | 11 ++++++++++-
2 files changed, 16 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 92e43b3a10df9..7350243a6a28d 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -262,9 +262,14 @@ static inline pte_t pte_mkdevmap(pte_t pte)
return set_pte_bit(pte, __pgprot(PTE_DEVMAP | PTE_SPECIAL));
}
-static inline void set_pte(pte_t *ptep, pte_t pte)
+static inline void set_pte_nosync(pte_t *ptep, pte_t pte)
{
WRITE_ONCE(*ptep, pte);
+}
+
+static inline void set_pte(pte_t *ptep, pte_t pte)
+{
+ set_pte_nosync(ptep, pte);
/*
* Only if the new pte is valid and kernel, otherwise TLB maintenance
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 130e915a3845e..c1dedd0b59f1b 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -175,7 +175,11 @@ static void init_pte(pte_t *ptep, unsigned long addr, unsigned long end,
do {
pte_t old_pte = READ_ONCE(*ptep);
- set_pte(ptep, pfn_pte(__phys_to_pfn(phys), prot));
+ /*
+ * Required barriers to make this visible to the table walker
+ * are deferred to the end of alloc_init_cont_pte().
+ */
+ set_pte_nosync(ptep, pfn_pte(__phys_to_pfn(phys), prot));
/*
* After the PTE entry has been populated once, we
@@ -229,6 +233,11 @@ static void alloc_init_cont_pte(pmd_t *pmdp, unsigned long addr,
phys += next - addr;
} while (addr = next, addr != end);
+ /*
+ * Note: barriers and maintenance necessary to clear the fixmap slot
+ * ensure that all previous pgtable writes are visible to the table
+ * walker.
+ */
pte_clear_fixmap();
}
--
2.43.0
* [PATCH 6.6 3/3] arm64: mm: Don't remap pgtables for allocate vs populate
2026-02-17 13:34 [PATCH 6.6 0/3] arm64: Speed up boot with faster linear map creation Ryan Roberts
2026-02-17 13:34 ` [PATCH 6.6 1/3] arm64: mm: Don't remap pgtables per-cont(pte|pmd) block Ryan Roberts
2026-02-17 13:34 ` [PATCH 6.6 2/3] arm64: mm: Batch dsb and isb when populating pgtables Ryan Roberts
@ 2026-02-17 13:34 ` Ryan Roberts
2026-02-17 13:50 ` [PATCH 6.6 0/3] arm64: Speed up boot with faster linear map creation Greg KH
3 siblings, 0 replies; 13+ messages in thread
From: Ryan Roberts @ 2026-02-17 13:34 UTC (permalink / raw)
To: stable
Cc: Ryan Roberts, catalin.marinas, will, linux-arm-kernel,
linux-kernel, Jack Aboutboul, Sharath George John, Noah Meyerhans,
Jim Perrin, Mark Rutland, Itaru Kitayama, Eric Chanudet,
Ard Biesheuvel
[ Upstream commit 0e9df1c905d8293d333ace86c13d147382f5caf9 ]
During linear map pgtable creation, each pgtable is fixmapped /
fixunmapped twice: once during allocation to zero the memory, and
again during population to write the entries. This means each table has
2 TLB invalidations issued against it. Let's fix this so that each table
is only fixmapped/fixunmapped once, halving the number of TLBIs, and
improving performance.
Achieve this by separating allocation and initialization (zeroing) of
the page. The allocated page is now fixmapped directly by the walker and
initialized, before being populated and finally fixunmapped.
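The difference between the two flows can be modelled in userspace as
below. This is a sketch under stated assumptions rather than the kernel
implementation: model_fixmap() and model_fixunmap() are hypothetical
stand-ins that only count unmaps (each of which would imply a TLBI), and
the table is an ordinary array of 512 entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model; these helpers are not kernel APIs. */
enum { MODEL_TABLE_ENTRIES = 512 };

/* Pretend to map the table; in this model the pointer is used directly. */
static uint64_t *model_fixmap(uint64_t *table)
{
	assert(table != NULL);
	return table;
}

/* Pretend to unmap; each unmap would cost one TLBI on real hardware. */
static void model_fixunmap(unsigned int *unmaps)
{
	(*unmaps)++;
}

/* Write placeholder entries into an already zeroed table. */
static void model_fill(uint64_t *p)
{
	for (size_t i = 0; i < MODEL_TABLE_ENTRIES; i++)
		p[i] = (uint64_t)i + 1;
}

/* Old flow: map to zero, unmap, then map again to populate, unmap. */
static unsigned int model_old_flow(uint64_t *table)
{
	unsigned int unmaps = 0;
	uint64_t *p = model_fixmap(table);

	memset(p, 0, MODEL_TABLE_ENTRIES * sizeof(*p));
	model_fixunmap(&unmaps);

	p = model_fixmap(table);
	model_fill(p);
	model_fixunmap(&unmaps);
	return unmaps;
}

/* New flow: map once, zero, populate, then unmap once. */
static unsigned int model_new_flow(uint64_t *table)
{
	unsigned int unmaps = 0;
	uint64_t *p = model_fixmap(table);

	memset(p, 0, MODEL_TABLE_ENTRIES * sizeof(*p));
	model_fill(p);
	model_fixunmap(&unmaps);
	return unmaps;
}
```

Both flows produce the same table contents; the old one counts 2 unmaps
per table and the new one counts 1.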
This approach keeps the change small, but has the side effect that late
allocations (using __get_free_page()) must also go through the generic
memory clearing routine. So let's tell __get_free_page() not to zero the
memory to avoid duplication.
Additionally this approach means that fixmap/fixunmap is still used for
late pgtable modifications. That's not technically needed since the
memory is all mapped in the linear map by that point. That's left as a
possible future optimization if found to be needed.
Execution time of map_mem(), which creates the kernel linear map page
tables, was measured on different machines with different RAM configs:
               | Apple M2 VM | Ampere Altra| Ampere Altra| Ampere Altra
               | VM, 16G     | VM, 64G     | VM, 256G    | Metal, 512G
---------------|-------------|-------------|-------------|-------------
               | ms      (%) | ms      (%) | ms      (%) | ms      (%)
---------------|-------------|-------------|-------------|-------------
before         |   11   (0%) |  161   (0%) |  656   (0%) |  1654  (0%)
after          |   10 (-11%) |  104 (-35%) |  438 (-33%) |  1223 (-26%)
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Itaru Kitayama <itaru.kitayama@fujitsu.com>
Tested-by: Eric Chanudet <echanude@redhat.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240412131908.433043-4-ryan.roberts@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
[ Ryan: Trivial backport ]
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
arch/arm64/mm/mmu.c | 58 ++++++++++++++++++++++-----------------------
1 file changed, 29 insertions(+), 29 deletions(-)
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index c1dedd0b59f1b..d6411f7f0b72c 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -106,28 +106,12 @@ EXPORT_SYMBOL(phys_mem_access_prot);
static phys_addr_t __init early_pgtable_alloc(int shift)
{
phys_addr_t phys;
- void *ptr;
phys = memblock_phys_alloc_range(PAGE_SIZE, PAGE_SIZE, 0,
MEMBLOCK_ALLOC_NOLEAKTRACE);
if (!phys)
panic("Failed to allocate page table page\n");
- /*
- * The FIX_{PGD,PUD,PMD} slots may be in active use, but the FIX_PTE
- * slot will be free, so we can (ab)use the FIX_PTE slot to initialise
- * any level of table.
- */
- ptr = pte_set_fixmap(phys);
-
- memset(ptr, 0, PAGE_SIZE);
-
- /*
- * Implicit barriers also ensure the zeroed page is visible to the page
- * table walker
- */
- pte_clear_fixmap();
-
return phys;
}
@@ -169,6 +153,14 @@ bool pgattr_change_is_safe(u64 old, u64 new)
return ((old ^ new) & ~mask) == 0;
}
+static void init_clear_pgtable(void *table)
+{
+ clear_page(table);
+
+ /* Ensure the zeroing is observed by page table walks. */
+ dsb(ishst);
+}
+
static void init_pte(pte_t *ptep, unsigned long addr, unsigned long end,
phys_addr_t phys, pgprot_t prot)
{
@@ -211,12 +203,15 @@ static void alloc_init_cont_pte(pmd_t *pmdp, unsigned long addr,
pmdval |= PMD_TABLE_PXN;
BUG_ON(!pgtable_alloc);
pte_phys = pgtable_alloc(PAGE_SHIFT);
+ ptep = pte_set_fixmap(pte_phys);
+ init_clear_pgtable(ptep);
+ ptep += pte_index(addr);
__pmd_populate(pmdp, pte_phys, pmdval);
- pmd = READ_ONCE(*pmdp);
+ } else {
+ BUG_ON(pmd_bad(pmd));
+ ptep = pte_set_fixmap_offset(pmdp, addr);
}
- BUG_ON(pmd_bad(pmd));
- ptep = pte_set_fixmap_offset(pmdp, addr);
do {
pgprot_t __prot = prot;
@@ -295,12 +290,15 @@ static void alloc_init_cont_pmd(pud_t *pudp, unsigned long addr,
pudval |= PUD_TABLE_PXN;
BUG_ON(!pgtable_alloc);
pmd_phys = pgtable_alloc(PMD_SHIFT);
+ pmdp = pmd_set_fixmap(pmd_phys);
+ init_clear_pgtable(pmdp);
+ pmdp += pmd_index(addr);
__pud_populate(pudp, pmd_phys, pudval);
- pud = READ_ONCE(*pudp);
+ } else {
+ BUG_ON(pud_bad(pud));
+ pmdp = pmd_set_fixmap_offset(pudp, addr);
}
- BUG_ON(pud_bad(pud));
- pmdp = pmd_set_fixmap_offset(pudp, addr);
do {
pgprot_t __prot = prot;
@@ -338,12 +336,15 @@ static void alloc_init_pud(pgd_t *pgdp, unsigned long addr, unsigned long end,
p4dval |= P4D_TABLE_PXN;
BUG_ON(!pgtable_alloc);
pud_phys = pgtable_alloc(PUD_SHIFT);
+ pudp = pud_set_fixmap(pud_phys);
+ init_clear_pgtable(pudp);
+ pudp += pud_index(addr);
__p4d_populate(p4dp, pud_phys, p4dval);
- p4d = READ_ONCE(*p4dp);
+ } else {
+ BUG_ON(p4d_bad(p4d));
+ pudp = pud_set_fixmap_offset(p4dp, addr);
}
- BUG_ON(p4d_bad(p4d));
- pudp = pud_set_fixmap_offset(p4dp, addr);
do {
pud_t old_pud = READ_ONCE(*pudp);
@@ -425,11 +426,10 @@ void create_kpti_ng_temp_pgd(pgd_t *pgdir, phys_addr_t phys, unsigned long virt,
static phys_addr_t __pgd_pgtable_alloc(int shift)
{
- void *ptr = (void *)__get_free_page(GFP_PGTABLE_KERNEL);
- BUG_ON(!ptr);
+ /* Page is zeroed by init_clear_pgtable() so don't duplicate effort. */
+ void *ptr = (void *)__get_free_page(GFP_PGTABLE_KERNEL & ~__GFP_ZERO);
- /* Ensure the zeroed page is visible to the page table walker */
- dsb(ishst);
+ BUG_ON(!ptr);
return __pa(ptr);
}
--
2.43.0
* Re: [PATCH 6.6 0/3] arm64: Speed up boot with faster linear map creation
2026-02-17 13:34 [PATCH 6.6 0/3] arm64: Speed up boot with faster linear map creation Ryan Roberts
` (2 preceding siblings ...)
2026-02-17 13:34 ` [PATCH 6.6 3/3] arm64: mm: Don't remap pgtables for allocate vs populate Ryan Roberts
@ 2026-02-17 13:50 ` Greg KH
2026-02-17 13:58 ` Ryan Roberts
3 siblings, 1 reply; 13+ messages in thread
From: Greg KH @ 2026-02-17 13:50 UTC (permalink / raw)
To: Ryan Roberts
Cc: stable, catalin.marinas, will, linux-arm-kernel, linux-kernel,
Jack Aboutboul, Sharath George John, Noah Meyerhans, Jim Perrin
On Tue, Feb 17, 2026 at 01:34:05PM +0000, Ryan Roberts wrote:
> Hi All,
>
> This series is a backport that applies to stable kernel 6.6 (base v6.6.126), for
> some speed ups to enable significantly faster booting on systems with a lot of
> memory. The patches were originally posted at:
>
> https://lore.kernel.org/linux-arm-kernel/20240412131908.433043-1-ryan.roberts@arm.com/
>
> ... and were originally merged upstream in v6.10-rc1.
>
> I'm requesting this be merged to stable on behalf of a partner who wants to get
> the benefit of this series in Debian 12.
Why can't they just use a newer kernel version (i.e. 6.12)? Surely they
would be able to justify moving to a newer kernel for performance
reasons, why enable them to stay on an older one, just delaying the
inevitable upgrade they will have to do anyway in a year or so?
thanks,
greg k-h
* Re: [PATCH 6.6 0/3] arm64: Speed up boot with faster linear map creation
2026-02-17 13:50 ` [PATCH 6.6 0/3] arm64: Speed up boot with faster linear map creation Greg KH
@ 2026-02-17 13:58 ` Ryan Roberts
2026-02-17 14:10 ` Greg KH
0 siblings, 1 reply; 13+ messages in thread
From: Ryan Roberts @ 2026-02-17 13:58 UTC (permalink / raw)
To: Greg KH
Cc: stable, catalin.marinas, will, linux-arm-kernel, linux-kernel,
Jack Aboutboul, Sharath George John, Noah Meyerhans, Jim Perrin
On 17/02/2026 13:50, Greg KH wrote:
> On Tue, Feb 17, 2026 at 01:34:05PM +0000, Ryan Roberts wrote:
>> Hi All,
>>
>> This series is a backport that applies to stable kernel 6.6 (base v6.6.126), for
>> some speed ups to enable significantly faster booting on systems with a lot of
>> memory. The patches were originally posted at:
>>
>> https://lore.kernel.org/linux-arm-kernel/20240412131908.433043-1-ryan.roberts@arm.com/
>>
>> ... and were originally merged upstream in v6.10-rc1.
>>
>> I'm requesting this be merged to stable on behalf of a partner who wants to get
>> the benefit of this series in Debian 12.
>
> Why can't they just use a newer kernel version (i.e. 6.12)? Surely they
> would be able to justify moving to a newer kernel for performance
> reasons, why enable them to stay on an older one, just delaying the
> inevitable upgrade they will have to do anyway in a year or so?
I can't answer this precisely, but I did ask and push for that approach. As I
understand it, they are stuck with Debian 12, which is stuck with kernel 6.1.
The Debian maintainer apparently requested that these go through stable in order
to get them into Debian 12.
Thanks,
Ryan
>
> thanks,
>
> greg k-h
* Re: [PATCH 6.6 0/3] arm64: Speed up boot with faster linear map creation
2026-02-17 13:58 ` Ryan Roberts
@ 2026-02-17 14:10 ` Greg KH
2026-02-17 14:21 ` Ryan Roberts
0 siblings, 1 reply; 13+ messages in thread
From: Greg KH @ 2026-02-17 14:10 UTC (permalink / raw)
To: Ryan Roberts
Cc: stable, catalin.marinas, will, linux-arm-kernel, linux-kernel,
Jack Aboutboul, Sharath George John, Noah Meyerhans, Jim Perrin
On Tue, Feb 17, 2026 at 01:58:36PM +0000, Ryan Roberts wrote:
> On 17/02/2026 13:50, Greg KH wrote:
> > On Tue, Feb 17, 2026 at 01:34:05PM +0000, Ryan Roberts wrote:
> >> Hi All,
> >>
> >> This series is a backport that applies to stable kernel 6.6 (base v6.6.126), for
> >> some speed ups to enable significantly faster booting on systems with a lot of
> >> memory. The patches were originally posted at:
> >>
> >> https://lore.kernel.org/linux-arm-kernel/20240412131908.433043-1-ryan.roberts@arm.com/
> >>
> >> ... and were originally merged upstream in v6.10-rc1.
> >>
> >> I'm requesting this be merged to stable on behalf of a partner who wants to get
> >> the benefit of this series in Debian 12.
> >
> > Why can't they just use a newer kernel version (i.e. 6.12)? Surely they
> > would be able to justify moving to a newer kernel for performance
> > reasons, why enable them to stay on an older one, just delaying the
> > inevitable upgrade they will have to do anyway in a year or so?
>
> I can't answer this precisely, but I did ask and push for that approach. As I
> understand it, they are stuck with Debian 12, which is stuck with kernel 6.1.
> The Debian maintainer apparently requested that these go through stable in order
> to get them into Debian 12.
I understand the position of Debian not wanting to take patches for new
features that are not already upstream, but really, Debian offers a
newer kernel for hardware that wants to use it for things like this,
right? Why not just use that instead?
thanks,
greg k-h
* Re: [PATCH 6.6 0/3] arm64: Speed up boot with faster linear map creation
2026-02-17 14:10 ` Greg KH
@ 2026-02-17 14:21 ` Ryan Roberts
2026-02-17 14:26 ` Greg KH
2026-02-17 14:27 ` Chen-Yu Tsai
0 siblings, 2 replies; 13+ messages in thread
From: Ryan Roberts @ 2026-02-17 14:21 UTC (permalink / raw)
To: Greg KH
Cc: stable, catalin.marinas, will, linux-arm-kernel, linux-kernel,
Jack Aboutboul, Sharath George John, Noah Meyerhans, Jim Perrin
On 17/02/2026 14:10, Greg KH wrote:
> On Tue, Feb 17, 2026 at 01:58:36PM +0000, Ryan Roberts wrote:
>> On 17/02/2026 13:50, Greg KH wrote:
>>> On Tue, Feb 17, 2026 at 01:34:05PM +0000, Ryan Roberts wrote:
>>>> Hi All,
>>>>
>>>> This series is a backport that applies to stable kernel 6.6 (base v6.6.126), for
>>>> some speed ups to enable significantly faster booting on systems with a lot of
>>>> memory. The patches were originally posted at:
>>>>
>>>> https://lore.kernel.org/linux-arm-kernel/20240412131908.433043-1-ryan.roberts@arm.com/
>>>>
>>>> ... and were originally merged upstream in v6.10-rc1.
>>>>
>>>> I'm requesting this be merged to stable on behalf of a partner who wants to get
>>>> the benefit of this series in Debian 12.
>>>
>>> Why can't they just use a newer kernel version (i.e. 6.12)? Surely they
>>> would be able to justify moving to a newer kernel for performance
>>> reasons, why enable them to stay on an older one, just delaying the
>>> inevitable upgrade they will have to do anyway in a year or so?
>>
>> I can't answer this precisely, but I did ask and push for that approach. As I
>> understand it, they are stuck with Debian 12, which is stuck with kernel 6.1.
>> The Debian maintainer apparently requested that these go through stable in order
>> to get them into Debian 12.
>
> I understand the position of Debian not wanting to take patches for new
> features that are not already upstream, but really, Debian offers a
> newer kernel for hardware that wants to use it for things like this,
> right? Why not just use that instead?
Let me go push a bit harder. But I expect we are in the grey zone between bug
and feature here; this is a performance bug fix, not a new feature. By
selectively backporting I'm guessing they are avoiding the risk of new features
that a new kernel brings introducing new bugs? I'm guessing there is a higher
qualification bar for that.
>
> thanks,
>
> greg k-h
* Re: [PATCH 6.6 0/3] arm64: Speed up boot with faster linear map creation
2026-02-17 14:21 ` Ryan Roberts
@ 2026-02-17 14:26 ` Greg KH
2026-02-17 14:43 ` Ryan Roberts
2026-02-17 14:27 ` Chen-Yu Tsai
1 sibling, 1 reply; 13+ messages in thread
From: Greg KH @ 2026-02-17 14:26 UTC (permalink / raw)
To: Ryan Roberts
Cc: stable, catalin.marinas, will, linux-arm-kernel, linux-kernel,
Jack Aboutboul, Sharath George John, Noah Meyerhans, Jim Perrin
On Tue, Feb 17, 2026 at 02:21:30PM +0000, Ryan Roberts wrote:
> On 17/02/2026 14:10, Greg KH wrote:
> > On Tue, Feb 17, 2026 at 01:58:36PM +0000, Ryan Roberts wrote:
> >> On 17/02/2026 13:50, Greg KH wrote:
> >>> On Tue, Feb 17, 2026 at 01:34:05PM +0000, Ryan Roberts wrote:
> >>>> Hi All,
> >>>>
> >>>> This series is a backport that applies to stable kernel 6.6 (base v6.6.126), for
> >>>> some speed ups to enable significantly faster booting on systems with a lot of
> >>>> memory. The patches were originally posted at:
> >>>>
> >>>> https://lore.kernel.org/linux-arm-kernel/20240412131908.433043-1-ryan.roberts@arm.com/
> >>>>
> >>>> ... and were originally merged upstream in v6.10-rc1.
> >>>>
> >>>> I'm requesting this be merged to stable on behalf of a partner who wants to get
> >>>> the benefit of this series in Debian 12.
> >>>
> >>> Why can't they just use a newer kernel version (i.e. 6.12)? Surely they
> >>> would be able to justify moving to a newer kernel for performance
> >>> reasons, why enable them to stay on an older one, just delaying the
> >>> inevitable upgrade they will have to do anyway in a year or so?
> >>
> >> I can't answer this precisely, but I did ask and push for that approach. As I
> >> understand it, they are stuck with Debian 12, which is stuck with kernel 6.1.
> >> The Debian maintainer apparently requested that these go through stable in order
> >> to get them into Debian 12.
> >
> > I understand the position of Debian not wanting to take patches for new
> > features that are not already upstream, but really, Debian offers a
> > newer kernel for hardware that wants to use it for things like this,
> > right? Why not just use that instead?
>
> Let me go push a bit harder. But I expect we are in the grey zone between bug
> and feature here; this is a performance bug fix, not a new feature. By
> selectively backporting I'm guessing they are avoiding the risk of new features
> that a new kernel brings introducing new bugs? I'm guessing there is a higher
> qualification bar for that.
That's a broken "qualification system" if that is the case, given that
the patches that flow back into stable kernel releases should be
triggering "full qualification" if anyone actually paid attention to
what goes into there :)
Anyway, good luck! And same for 6.1.y, if they are ok with 6.6.y, why
would they even care about 6.1.y?
thanks,
greg k-h
* Re: [PATCH 6.6 0/3] arm64: Speed up boot with faster linear map creation
2026-02-17 14:21 ` Ryan Roberts
2026-02-17 14:26 ` Greg KH
@ 2026-02-17 14:27 ` Chen-Yu Tsai
2026-02-18 19:49 ` Noah Meyerhans
1 sibling, 1 reply; 13+ messages in thread
From: Chen-Yu Tsai @ 2026-02-17 14:27 UTC (permalink / raw)
To: Ryan Roberts
Cc: Greg KH, stable, catalin.marinas, will, linux-arm-kernel,
linux-kernel, Jack Aboutboul, Sharath George John, Noah Meyerhans,
Jim Perrin
On Tue, Feb 17, 2026 at 10:21 PM Ryan Roberts <ryan.roberts@arm.com> wrote:
>
> On 17/02/2026 14:10, Greg KH wrote:
> > On Tue, Feb 17, 2026 at 01:58:36PM +0000, Ryan Roberts wrote:
> >> On 17/02/2026 13:50, Greg KH wrote:
> >>> On Tue, Feb 17, 2026 at 01:34:05PM +0000, Ryan Roberts wrote:
> >>>> Hi All,
> >>>>
> >>>> This series is a backport that applies to stable kernel 6.6 (base v6.6.126), for
> >>>> some speed ups to enable significantly faster booting on systems with a lot of
> >>>> memory. The patches were originally posted at:
> >>>>
> >>>> https://lore.kernel.org/linux-arm-kernel/20240412131908.433043-1-ryan.roberts@arm.com/
> >>>>
> >>>> ... and were originally merged upstream in v6.10-rc1.
> >>>>
> >>>> I'm requesting this be merged to stable on behalf of a partner who wants to get
> >>>> the benefit of this series in Debian 12.
> >>>
> >>> Why can't they just use a newer kernel version (i.e. 6.12)? Surely they
> >>> would be able to justify moving to a newer kernel for performance
> >>> reasons, why enable them to stay on an older one, just delaying the
> >>> inevitable upgrade they will have to do anyway in a year or so?
> >>
> >> I can't answer this precisely, but I did ask and push for that approach. As I
> >> understand it, they are stuck with Debian 12, which is stuck with kernel 6.1.
> >> The Debian maintainer apparently requested that these go through stable in order
> >> to get them into Debian 12.
> >
> > I understand the position of Debian not wanting to take patches for new
> > features that are not already upstream, but really, Debian offers a
> > newer kernel for hardware that wants to use it for things like this,
> > right? Why not just use that instead?
>
> Let me go push a bit harder. But I expect we are in the grey zone between bug
> and feature here; this is a performance bug fix, not a new feature. By
> selectively backporting I'm guessing they are avoiding the risk of new features
> that a new kernel brings introducing new bugs? I'm guessing there is a higher
> qualification bar for that.
Why can't they use the kernel from bookworm-backports, which is 6.12?
ChenYu
* Re: [PATCH 6.6 0/3] arm64: Speed up boot with faster linear map creation
2026-02-17 14:26 ` Greg KH
@ 2026-02-17 14:43 ` Ryan Roberts
2026-02-18 9:33 ` Ryan Roberts
0 siblings, 1 reply; 13+ messages in thread
From: Ryan Roberts @ 2026-02-17 14:43 UTC (permalink / raw)
To: Greg KH
Cc: stable, catalin.marinas, will, linux-arm-kernel, linux-kernel,
Jack Aboutboul, Sharath George John, Noah Meyerhans, Jim Perrin
On 17/02/2026 14:26, Greg KH wrote:
> On Tue, Feb 17, 2026 at 02:21:30PM +0000, Ryan Roberts wrote:
>> On 17/02/2026 14:10, Greg KH wrote:
>>> On Tue, Feb 17, 2026 at 01:58:36PM +0000, Ryan Roberts wrote:
>>>> On 17/02/2026 13:50, Greg KH wrote:
>>>>> On Tue, Feb 17, 2026 at 01:34:05PM +0000, Ryan Roberts wrote:
>>>>>> Hi All,
>>>>>>
>>>>>> This series is a backport that applies to stable kernel 6.6 (base v6.6.126), for
>>>>>> some speed ups to enable significantly faster booting on systems with a lot of
>>>>>> memory. The patches were originally posted at:
>>>>>>
>>>>>> https://lore.kernel.org/linux-arm-kernel/20240412131908.433043-1-ryan.roberts@arm.com/
>>>>>>
>>>>>> ... and were originally merged upstream in v6.10-rc1.
>>>>>>
>>>>>> I'm requesting this be merged to stable on behalf of a partner who wants to get
>>>>>> the benefit of this series in Debian 12.
>>>>>
>>>>> Why can't they just use a newer kernel version (i.e. 6.12)? Surely they
>>>>> would be able to justify moving to a newer kernel for performance
>>>>> reasons, why enable them to stay on an older one, just delaying the
>>>>> inevitable upgrade they will have to do anyway in a year or so?
>>>>
>>>> I can't answer this precisely, but I did ask and push for that approach. As I
>>>> understand it, they are stuck with Debian 12, which is stuck with kernel 6.1.
>>>> The Debian maintainer apparently requested that these go through stable in order
>>>> to get them into Debian 12.
>>>
>>> I understand the position of Debian not wanting to take patches for new
>>> features that are not already upstream, but really, Debian offers a
>>> newer kernel for hardware that wants to use it for things like this,
>>> right? Why not just use that instead?
>>
>> Let me go push a bit harder. But I expect we are in the grey zone between bug
>> and feature here; this is a performance bug fix, not a new feature. By
>> selectively backporting I'm guessing they are avoiding the risk of new features
>> that a new kernel brings introducing new bugs? I'm guessing there is a higher
>> qualification bar for that.
>
> That's a broken "qualification system" if that is the case, given that
> the patches that flow back into stable kernel releases should be
> triggering "full qualification" if anyone actually paid attention to
> what goes into there :)
>
> Anyway, good luck! And same for 6.1.y, if they are ok with 6.6.y, why
> would they even care about 6.1.y?
The request was only for 6.1. I did 6.6 as well for continuity; I didn't want it
to get slow again if they moved from 6.1 to 6.6. It's already fixed in 6.12.
>
> thanks,
>
> greg k-h
* Re: [PATCH 6.6 0/3] arm64: Speed up boot with faster linear map creation
2026-02-17 14:43 ` Ryan Roberts
@ 2026-02-18 9:33 ` Ryan Roberts
0 siblings, 0 replies; 13+ messages in thread
From: Ryan Roberts @ 2026-02-18 9:33 UTC (permalink / raw)
To: Greg KH
Cc: stable, catalin.marinas, will, linux-arm-kernel, linux-kernel,
Jack Aboutboul, Sharath George John, Noah Meyerhans, Jim Perrin
On 17/02/2026 14:43, Ryan Roberts wrote:
> On 17/02/2026 14:26, Greg KH wrote:
>> On Tue, Feb 17, 2026 at 02:21:30PM +0000, Ryan Roberts wrote:
>>> On 17/02/2026 14:10, Greg KH wrote:
>>>> On Tue, Feb 17, 2026 at 01:58:36PM +0000, Ryan Roberts wrote:
>>>>> On 17/02/2026 13:50, Greg KH wrote:
>>>>>> On Tue, Feb 17, 2026 at 01:34:05PM +0000, Ryan Roberts wrote:
>>>>>>> Hi All,
>>>>>>>
>>>>>>> This series is a backport that applies to stable kernel 6.6 (base v6.6.126), for
>>>>>>> some speed ups to enable significantly faster booting on systems with a lot of
>>>>>>> memory. The patches were originally posted at:
>>>>>>>
>>>>>>> https://lore.kernel.org/linux-arm-kernel/20240412131908.433043-1-ryan.roberts@arm.com/
>>>>>>>
>>>>>>> ... and were originally merged upstream in v6.10-rc1.
>>>>>>>
>>>>>>> I'm requesting this be merged to stable on behalf of a partner who wants to get
>>>>>>> the benefit of this series in Debian 12.
>>>>>>
>>>>>> Why can't they just use a newer kernel version (i.e. 6.12)? Surely they
>>>>>> would be able to justify moving to a newer kernel for performance
>>>>>> reasons, why enable them to stay on an older one, just delaying the
>>>>>> inevitable upgrade they will have to do anyway in a year or so?
>>>>>
>>>>> I can't answer this precisely, but I did ask and push for that approach. As I
>>>>> understand it, they are stuck with Debian 12, which is stuck with kernel 6.1.
>>>>> The Debian maintainer apparently requested that these go through stable in order
>>>>> to get them into Debian 12.
>>>>
>>>> I understand the position of Debian not wanting to take patches for new
>>>> features that are not already upstream, but really, Debian offers a
>>>> newer kernel for hardware that wants to use it for things like this,
>>>> right? Why not just use that instead?
>>>
>>> Let me go push a bit harder. But I expect we are in the grey zone between bug
>>> and feature here; this is a performance bug fix, not a new feature. By
>>> selectively backporting I'm guessing they are avoiding the risk of new features
>>> that a new kernel brings introducing new bugs? I'm guessing there is a higher
>>> qualification bar for that.
>>
>> That's a broken "qualification system" if that is the case, given that
>> the patches that flow back into stable kernel releases should be
>> triggering "full qualification" if anyone actually paid attention to
>> what goes into there :)
>>
>> Anyway, good luck! And same for 6.1.y, if they are ok with 6.6.y, why
>> would they even care about 6.1.y?
>
> The request was only for 6.1. I did 6.6 as well for continuity; I didn't want it
> to get slow again if they moved from 6.1 to 6.6. It's already fixed in 6.12.
Hi Greg,
I thought a bit more about this overnight, and decided I wanted to have one more
go at convincing you...
In case you didn't read the commit logs, this series fixes a pretty nasty
performance bug; for a machine with 512G of RAM, it previously took 17.5 seconds
to create the linear map, and with the changes, it's down to 1.2 seconds. That's
quite a big quality-of-life improvement if you are booting VMs regularly
(personally, I hit this quite a bit).
It's a low-risk change - it's been in since v6.10 and is part of arm64's core
boot path - and no issues have ever been raised.
Are you sure this isn't the sort of change that should be considered for stable?
Thanks,
Ryan
>
>
>>
>> thanks,
>>
>> greg k-h
>
* Re: [PATCH 6.6 0/3] arm64: Speed up boot with faster linear map creation
2026-02-17 14:27 ` Chen-Yu Tsai
@ 2026-02-18 19:49 ` Noah Meyerhans
0 siblings, 0 replies; 13+ messages in thread
From: Noah Meyerhans @ 2026-02-18 19:49 UTC (permalink / raw)
To: Chen-Yu Tsai
Cc: Ryan Roberts, Greg KH, stable, catalin.marinas, will,
linux-arm-kernel, linux-kernel, Jack Aboutboul,
Sharath George John, Noah Meyerhans, Jim Perrin
On Tue, Feb 17, 2026 at 10:27:53PM +0800, Chen-Yu Tsai wrote:
>
> On Tue, Feb 17, 2026 at 10:21 PM Ryan Roberts <ryan.roberts@arm.com> wrote:
> >
> > On 17/02/2026 14:10, Greg KH wrote:
> > > On Tue, Feb 17, 2026 at 01:58:36PM +0000, Ryan Roberts wrote:
> > >> On 17/02/2026 13:50, Greg KH wrote:
> > >>> On Tue, Feb 17, 2026 at 01:34:05PM +0000, Ryan Roberts wrote:
> > >>>> Hi All,
> > >>>>
> > >>>> This series is a backport that applies to stable kernel 6.6 (base v6.6.126), for
> > >>>> some speed ups to enable significantly faster booting on systems with a lot of
> > >>>> memory. The patches were originally posted at:
> > >>>>
> > >>>> https://lore.kernel.org/linux-arm-kernel/20240412131908.433043-1-ryan.roberts@arm.com/
> > >>>>
> > >>>> ... and were originally merged upstream in v6.10-rc1.
> > >>>>
> > >>>> I'm requesting this be merged to stable on behalf of a partner who wants to get
> > >>>> the benefit of this series in Debian 12.
> > >>>
> > >>> Why can't they just use a newer kernel version (i.e. 6.12)? Surely they
> > >>> would be able to justify moving to a newer kernel for performance
> > >>> reasons, why enable them to stay on an older one, just delaying the
> > >>> inevitable upgrade they will have to do anyway in a year or so?
> > >>
> > >> I can't answer this precisely, but I did ask and push for that approach. As I
> > >> understand it, they are stuck with Debian 12, which is stuck with kernel 6.1.
> > >> The Debian maintainer apparently requested that these go through stable in order
> > >> to get them into Debian 12.
> > >
> > > I understand the position of Debian not wanting to take patches for new
> > > features that are not already upstream, but really, Debian offers a
> > > newer kernel for hardware that wants to use it for things like this,
> > > right? Why not just use that instead?
> >
> > Let me go push a bit harder. But I expect we are in the grey zone between bug
> > and feature here; this is a performance bug fix, not a new feature. By
> > selectively backporting I'm guessing they are avoiding the risk of new features
> > that a new kernel brings introducing new bugs? I'm guessing there is a higher
> > qualification bar for that.
>
> Why can't they use the kernel from bookworm-backports, which is 6.12?
Bookworm-backports will likely be our recommendation should this
patchset ultimately be rejected. Debian 12 uses 6.1.y by default.
While 6.12.y is available for that release via the bookworm-backports
repository, bookworm-backports content is not generally recommended for
production usage. It's not necessarily updated on the same cadence as
the 6.1.y packages, and Debian does not publish security advisories for
it.
Debian does not want to maintain this change as a downstream patch,
which is fair. Microsoft would like to make this boot optimization
available by default to Debian 12 users (of which there are still many)
which is why we're pursuing this path. Naturally, we'll manage if we
can't get the change applied to 6.1.y, but hopefully this explains where
we're coming from.
noah
Thread overview: 13+ messages
2026-02-17 13:34 [PATCH 6.6 0/3] arm64: Speed up boot with faster linear map creation Ryan Roberts
2026-02-17 13:34 ` [PATCH 6.6 1/3] arm64: mm: Don't remap pgtables per-cont(pte|pmd) block Ryan Roberts
2026-02-17 13:34 ` [PATCH 6.6 2/3] arm64: mm: Batch dsb and isb when populating pgtables Ryan Roberts
2026-02-17 13:34 ` [PATCH 6.6 3/3] arm64: mm: Don't remap pgtables for allocate vs populate Ryan Roberts
2026-02-17 13:50 ` [PATCH 6.6 0/3] arm64: Speed up boot with faster linear map creation Greg KH
2026-02-17 13:58 ` Ryan Roberts
2026-02-17 14:10 ` Greg KH
2026-02-17 14:21 ` Ryan Roberts
2026-02-17 14:26 ` Greg KH
2026-02-17 14:43 ` Ryan Roberts
2026-02-18 9:33 ` Ryan Roberts
2026-02-17 14:27 ` Chen-Yu Tsai
2026-02-18 19:49 ` Noah Meyerhans