Linux-ARM-Kernel Archive on lore.kernel.org
From: <gregkh@linuxfoundation.org>
To: Jim.Perrin@microsoft.com,ardb@kernel.org,catalin.marinas@arm.com,echanude@redhat.com,itaru.kitayama@fujitsu.com,jaboutboul@microsoft.com,linux-arm-kernel@lists.infradead.org,mark.rutland@arm.com,nmeyerhans@microsoft.com,ryan.roberts@arm.com,sgeorgejohn@microsoft.com,will@kernel.org
Cc: <stable-commits@vger.kernel.org>
Subject: Patch "arm64: mm: Don't remap pgtables per-cont(pte|pmd) block" has been added to the 6.6-stable tree
Date: Thu, 19 Mar 2026 12:12:51 +0100	[thread overview]
Message-ID: <2026031951-disband-calzone-f832@gregkh> (raw)
In-Reply-To: <20260217133411.2881311-2-ryan.roberts@arm.com>


This is a note to let you know that I've just added the patch titled

    arm64: mm: Don't remap pgtables per-cont(pte|pmd) block

to the 6.6-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     arm64-mm-don-t-remap-pgtables-per-cont-pte-pmd-block.patch
and it can be found in the queue-6.6 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.


From stable+bounces-216825-greg=kroah.com@vger.kernel.org Tue Feb 17 14:34:47 2026
From: Ryan Roberts <ryan.roberts@arm.com>
Date: Tue, 17 Feb 2026 13:34:06 +0000
Subject: arm64: mm: Don't remap pgtables per-cont(pte|pmd) block
To: stable@vger.kernel.org
Cc: Ryan Roberts <ryan.roberts@arm.com>, catalin.marinas@arm.com, will@kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, Jack Aboutboul <jaboutboul@microsoft.com>, Sharath George John <sgeorgejohn@microsoft.com>, Noah Meyerhans <nmeyerhans@microsoft.com>, Jim Perrin <Jim.Perrin@microsoft.com>, Itaru Kitayama <itaru.kitayama@fujitsu.com>, Eric Chanudet <echanude@redhat.com>, Mark Rutland <mark.rutland@arm.com>, Ard Biesheuvel <ardb@kernel.org>
Message-ID: <20260217133411.2881311-2-ryan.roberts@arm.com>

From: Ryan Roberts <ryan.roberts@arm.com>

[ Upstream commit 5c63db59c5f89925add57642be4f789d0d671ccd ]

A large part of the kernel boot time is spent creating the kernel linear
map page tables. When rodata=full, all memory is mapped by pte, and when
there is a lot of physical RAM, there are a lot of pte tables to
populate. The primary cost associated with this is mapping and unmapping
the pte table memory in the fixmap; at unmap time, the TLB entry must be
invalidated, and this is expensive.

Previously, each pmd and pte table was fixmapped/fixunmapped for each
cont(pte|pmd) block of mappings (16 entries with 4K granule). This means
we ended up issuing 32 TLBIs per (pmd|pte) table during the population
phase.

Let's fix that, and fixmap/fixunmap each page once per population, for a
saving of 31 TLBIs per (pmd|pte) table. This gives a significant boot
speedup.

Execution time of map_mem(), which creates the kernel linear map page
tables, was measured on different machines with different RAM configs:

               | Apple M2 VM  | Ampere Altra | Ampere Altra | Ampere Altra
               | VM, 16G      | VM, 64G      | VM, 256G     | Metal, 512G
---------------|--------------|--------------|--------------|--------------
               |   ms     (%) |   ms     (%) |   ms     (%) |    ms    (%)
---------------|--------------|--------------|--------------|--------------
before         |  168    (0%) | 2198    (0%) | 8644    (0%) | 17447   (0%)
after          |   78  (-53%) |  435  (-80%) | 1723  (-80%) |  3779  (-78%)

Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Tested-by: Itaru Kitayama <itaru.kitayama@fujitsu.com>
Tested-by: Eric Chanudet <echanude@redhat.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20240412131908.433043-2-ryan.roberts@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
[ Ryan: Trivial backport ]
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/arm64/mm/mmu.c |   27 ++++++++++++++-------------
 1 file changed, 14 insertions(+), 13 deletions(-)

--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -169,12 +169,9 @@ bool pgattr_change_is_safe(u64 old, u64
 	return ((old ^ new) & ~mask) == 0;
 }
 
-static void init_pte(pmd_t *pmdp, unsigned long addr, unsigned long end,
+static void init_pte(pte_t *ptep, unsigned long addr, unsigned long end,
 		     phys_addr_t phys, pgprot_t prot)
 {
-	pte_t *ptep;
-
-	ptep = pte_set_fixmap_offset(pmdp, addr);
 	do {
 		pte_t old_pte = READ_ONCE(*ptep);
 
@@ -189,8 +186,6 @@ static void init_pte(pmd_t *pmdp, unsign
 
 		phys += PAGE_SIZE;
 	} while (ptep++, addr += PAGE_SIZE, addr != end);
-
-	pte_clear_fixmap();
 }
 
 static void alloc_init_cont_pte(pmd_t *pmdp, unsigned long addr,
@@ -201,6 +196,7 @@ static void alloc_init_cont_pte(pmd_t *p
 {
 	unsigned long next;
 	pmd_t pmd = READ_ONCE(*pmdp);
+	pte_t *ptep;
 
 	BUG_ON(pmd_sect(pmd));
 	if (pmd_none(pmd)) {
@@ -216,6 +212,7 @@ static void alloc_init_cont_pte(pmd_t *p
 	}
 	BUG_ON(pmd_bad(pmd));
 
+	ptep = pte_set_fixmap_offset(pmdp, addr);
 	do {
 		pgprot_t __prot = prot;
 
@@ -226,20 +223,21 @@ static void alloc_init_cont_pte(pmd_t *p
 		    (flags & NO_CONT_MAPPINGS) == 0)
 			__prot = __pgprot(pgprot_val(prot) | PTE_CONT);
 
-		init_pte(pmdp, addr, next, phys, __prot);
+		init_pte(ptep, addr, next, phys, __prot);
 
+		ptep += pte_index(next) - pte_index(addr);
 		phys += next - addr;
 	} while (addr = next, addr != end);
+
+	pte_clear_fixmap();
 }
 
-static void init_pmd(pud_t *pudp, unsigned long addr, unsigned long end,
+static void init_pmd(pmd_t *pmdp, unsigned long addr, unsigned long end,
 		     phys_addr_t phys, pgprot_t prot,
 		     phys_addr_t (*pgtable_alloc)(int), int flags)
 {
 	unsigned long next;
-	pmd_t *pmdp;
 
-	pmdp = pmd_set_fixmap_offset(pudp, addr);
 	do {
 		pmd_t old_pmd = READ_ONCE(*pmdp);
 
@@ -265,8 +263,6 @@ static void init_pmd(pud_t *pudp, unsign
 		}
 		phys += next - addr;
 	} while (pmdp++, addr = next, addr != end);
-
-	pmd_clear_fixmap();
 }
 
 static void alloc_init_cont_pmd(pud_t *pudp, unsigned long addr,
@@ -276,6 +272,7 @@ static void alloc_init_cont_pmd(pud_t *p
 {
 	unsigned long next;
 	pud_t pud = READ_ONCE(*pudp);
+	pmd_t *pmdp;
 
 	/*
 	 * Check for initial section mappings in the pgd/pud.
@@ -294,6 +291,7 @@ static void alloc_init_cont_pmd(pud_t *p
 	}
 	BUG_ON(pud_bad(pud));
 
+	pmdp = pmd_set_fixmap_offset(pudp, addr);
 	do {
 		pgprot_t __prot = prot;
 
@@ -304,10 +302,13 @@ static void alloc_init_cont_pmd(pud_t *p
 		    (flags & NO_CONT_MAPPINGS) == 0)
 			__prot = __pgprot(pgprot_val(prot) | PTE_CONT);
 
-		init_pmd(pudp, addr, next, phys, __prot, pgtable_alloc, flags);
+		init_pmd(pmdp, addr, next, phys, __prot, pgtable_alloc, flags);
 
+		pmdp += pmd_index(next) - pmd_index(addr);
 		phys += next - addr;
 	} while (addr = next, addr != end);
+
+	pmd_clear_fixmap();
 }
 
 static void alloc_init_pud(pgd_t *pgdp, unsigned long addr, unsigned long end,


Patches currently in stable-queue which might be from ryan.roberts@arm.com are

queue-6.6/arm64-mm-don-t-remap-pgtables-per-cont-pte-pmd-block.patch
queue-6.6/arm64-mm-don-t-remap-pgtables-for-allocate-vs-populate.patch
queue-6.6/arm64-mm-batch-dsb-and-isb-when-populating-pgtables.patch


Thread overview: 16+ messages
2026-02-17 13:34 [PATCH 6.6 0/3] arm64: Speed up boot with faster linear map creation Ryan Roberts
2026-02-17 13:34 ` [PATCH 6.6 1/3] arm64: mm: Don't remap pgtables per-cont(pte|pmd) block Ryan Roberts
2026-03-19 11:12   ` gregkh [this message]
2026-02-17 13:34 ` [PATCH 6.6 2/3] arm64: mm: Batch dsb and isb when populating pgtables Ryan Roberts
2026-03-19 11:12   ` Patch "arm64: mm: Batch dsb and isb when populating pgtables" has been added to the 6.6-stable tree gregkh
2026-02-17 13:34 ` [PATCH 6.6 3/3] arm64: mm: Don't remap pgtables for allocate vs populate Ryan Roberts
2026-03-19 11:12   ` Patch "arm64: mm: Don't remap pgtables for allocate vs populate" has been added to the 6.6-stable tree gregkh
2026-02-17 13:50 ` [PATCH 6.6 0/3] arm64: Speed up boot with faster linear map creation Greg KH
2026-02-17 13:58   ` Ryan Roberts
2026-02-17 14:10     ` Greg KH
2026-02-17 14:21       ` Ryan Roberts
2026-02-17 14:26         ` Greg KH
2026-02-17 14:43           ` Ryan Roberts
2026-02-18  9:33             ` Ryan Roberts
2026-02-17 14:27         ` Chen-Yu Tsai
2026-02-18 19:49           ` Noah Meyerhans
