public inbox for linux-kernel@vger.kernel.org
* [RFC][PATCH 0/8] x86/mm: Simplify PAE page table handling
@ 2025-01-23 17:24 Dave Hansen
From: Dave Hansen @ 2025-01-23 17:24 UTC
  To: linux-kernel
  Cc: x86, tglx, bp, joro, luto, peterz, kirill.shutemov,
	rick.p.edgecombe, jgross, Dave Hansen

tl;dr: 32-bit PAE page table handling is a bit different when PTI
is on and off. Making the handling uniform removes a good amount
of code at the cost of not sharing kernel PMDs. The downside of
this simplification is bloating non-PTI PAE kernels by ~2 pages
per process.

Anyone who cares about security on 32-bit is running with PTI and
PAE because PAE has the No-eXecute page table bit. They are already
paying the 2-page penalty. Anyone who cares more about memory
footprint than security is probably already running a !PAE kernel
and will not be affected by this.

--

There are two 32-bit x86 hardware page table formats. A 2-level one
with 32-bit pte_t's and a 3-level one with 64-bit pte_t's called PAE.
But the PAE one is wonky. It effectively loses a bit of addressing
radix per level since its PTEs are twice as large. It makes up for
that by adding the third level, but with only 4 entries in the level.
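
To make the radix arithmetic concrete, here is a minimal sketch in C of
how a 32-bit virtual address splits into table indices under each
format, assuming 4 KiB pages. The struct and function names are made up
for illustration and are not kernel APIs; the bit widths (10/10/12 and
2/9/9/12) are the standard hardware layouts.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative only: these types and helpers are hypothetical, not kernel code. */
struct idx_2level { uint32_t pgd, pte, offset; };
struct idx_pae    { uint32_t pgd, pmd, pte, offset; };

/* 2-level: 32-bit entries, 1024 per 4 KiB table, so 10 index bits per level. */
static inline struct idx_2level split_2level(uint32_t va)
{
	struct idx_2level i = {
		.pgd    = va >> 22,
		.pte    = (va >> 12) & 0x3ffu,
		.offset = va & 0xfffu,
	};
	return i;
}

/*
 * PAE: 64-bit entries, 512 per 4 KiB table, so only 9 index bits per
 * level. The 2 bits left over select one of 4 top-level entries.
 */
static inline struct idx_pae split_pae(uint32_t va)
{
	struct idx_pae i = {
		.pgd    = va >> 30,
		.pmd    = (va >> 21) & 0x1ffu,
		.pte    = (va >> 12) & 0x1ffu,
		.offset = va & 0xfffu,
	};
	return i;
}

/* Both layouts must consume exactly 32 address bits. */
static_assert(10 + 10 + 12 == 32, "2-level split covers 32 bits");
static_assert(2 + 9 + 9 + 12 == 32, "PAE split covers 32 bits");
```

For example, split_pae(0xC0000000u) lands in top-level entry 3 with PMD
index 0, while split_2level() puts the same address in top-level entry
768.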

This leads to all kinds of fun because this level only needs 32 bytes
instead of a whole page. Also, since it has only 4 entries in the top
level, the hardware just always caches the entire thing aggressively.
Modifying a PAE pgd_t ends up needing different rules than the
other x86 paging modes and probably every other architecture too.
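
The size mismatch is simple arithmetic. The sketch below just spells it
out in C, assuming 4 KiB pages; the macro and helper names are invented
for this example and do not match the kernel's own definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration; assumes 4 KiB pages. */
#define EX_PAGE_BYTES        4096u
#define EX_PAE_PGD_ENTRIES   4u     /* top level of the 3-level format */
#define EX_PAE_TABLE_ENTRIES 512u   /* PMD and PTE tables under PAE */
#define EX_2LVL_ENTRIES      1024u  /* both levels of the 2-level format */

/* Bytes occupied by a table with the given entry count and entry size. */
static inline unsigned int table_bytes(unsigned int entries,
				       unsigned int entry_size)
{
	return entries * entry_size;
}

/* How many minimal PAE top-level tables would fit in one page. */
static inline unsigned int pae_pgds_per_page(void)
{
	return EX_PAGE_BYTES /
	       table_bytes(EX_PAE_PGD_ENTRIES, sizeof(uint64_t));
}

/* The PAE top level needs 32 bytes; every other table fills a page. */
static_assert(EX_PAE_PGD_ENTRIES * sizeof(uint64_t) == 32,
	      "PAE top level is 32 bytes");
static_assert(EX_PAE_TABLE_ENTRIES * sizeof(uint64_t) == EX_PAGE_BYTES,
	      "PAE lower-level tables fill a page");
static_assert(EX_2LVL_ENTRIES * sizeof(uint32_t) == EX_PAGE_BYTES,
	      "2-level tables fill a page");
```

Put differently, giving every PAE top-level table a whole page uses 128
times the space the hardware strictly requires for that level.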

PAE support got even weirder when Xen came along. Xen wants to trap
into the hypervisor on page table writes and so it protects the guest
page tables with paging protections. It can't protect a 32 byte
object with paging protections so it bloats the 32-byte object out
to a page. Xen also didn't support sharing kernel PMD pages.  This
is mostly moot now because support for running as a 32-bit Xen guest
was ripped out, but there are still remnants around.

PAE also interacts with PTI in fun and exciting ways. Since pgd
updates are so fraught, the PTI PAE implementation just chose to
avoid pgd updates by preallocating all the PMDs up front since
there are only 4 instead of 512 or 1024 in the other x86 paging
modes.

Make PAE less weird:
 * Always allocate a page for PAE PGDs. This brings them in line
   with the other 2 paging modes. It was done for Xen and for
   PTI already and nobody screamed, so just do it everywhere.
 * Never share kernel PMD pages. This brings PAE in line with
   32-bit !PAE and 64-bit.
 * Always preallocate all PAE PMD pages. This basically makes
   all PAE kernels behave like PTI ones. It might waste a page
   of memory, but all 4 pages probably get allocated in the common
   case anyway.

--

 include/asm/pgtable-2level_types.h |    2
 include/asm/pgtable-3level_types.h |    4 -
 include/asm/pgtable_64_types.h     |    2
 mm/pat/set_memory.c                |    2
 mm/pgtable.c                       |  104 +++++--------------------------------
 5 files changed, 18 insertions(+), 96 deletions(-)


Thread overview: 16+ messages
2025-01-23 17:24 [RFC][PATCH 0/8] x86/mm: Simplify PAE page table handling Dave Hansen
2025-01-23 17:24 ` [RFC][PATCH 1/8] x86/mm: Always allocate a whole page for PAE PGDs Dave Hansen
2025-01-23 17:24 ` [RFC][PATCH 2/8] x86/mm: Always "broadcast" PMD setting operations Dave Hansen
2025-01-23 17:24 ` [RFC][PATCH 3/8] x86/mm: Always tell core mm to sync kernel mappings Dave Hansen
2025-01-23 17:24 ` [RFC][PATCH 4/8] x86/mm: Simplify PAE PGD sharing macros Dave Hansen
2025-01-23 17:24 ` [RFC][PATCH 5/8] x86/mm: Fix up comments around PMD preallocation Dave Hansen
2025-01-23 17:24 ` [RFC][PATCH 6/8] x86/mm: Preallocate all PAE page tables Dave Hansen
2025-01-23 17:24 ` [RFC][PATCH 7/8] x86/mm: Remove duplicated PMD preallocation macro Dave Hansen
2025-01-23 17:24 ` [RFC][PATCH 8/8] x86/mm: Remove now unused SHARED_KERNEL_PMD Dave Hansen
2025-01-23 21:49 ` [RFC][PATCH 0/8] x86/mm: Simplify PAE page table handling Peter Zijlstra
2025-01-23 23:06   ` Dave Hansen
2025-01-24  7:58     ` Joerg Roedel
2025-01-24 19:12       ` Dave Hansen
2025-01-28  8:13         ` Joerg Roedel
2025-01-24  8:52     ` Peter Zijlstra
2025-02-24 18:55 ` Ingo Molnar
