public inbox for linux-kernel@vger.kernel.org
From: Ingo Molnar <mingo@kernel.org>
To: Dave Hansen <dave.hansen@linux.intel.com>
Cc: linux-kernel@vger.kernel.org, x86@kernel.org, tglx@linutronix.de,
	bp@alien8.de, joro@8bytes.org, luto@kernel.org,
	peterz@infradead.org, kirill.shutemov@linux.intel.com,
	rick.p.edgecombe@intel.com, jgross@suse.com
Subject: Re: [RFC][PATCH 0/8] x86/mm: Simplify PAE page table handling
Date: Mon, 24 Feb 2025 19:55:01 +0100
Message-ID: <Z7zAhSAzpU_MCGnO@gmail.com>
In-Reply-To: <20250123172428.D6D8C8D9@davehans-spike.ostc.intel.com>


* Dave Hansen <dave.hansen@linux.intel.com> wrote:

> tl;dr: 32-bit PAE page table handling is a bit different when PTI
> is on and off. Making the handling uniform removes a good amount
> of code at the cost of not sharing kernel PMDs. The downside of
> this simplification is bloating non-PTI PAE kernels by ~2 pages
> per process.
> 
> Anyone who cares about security on 32-bit is running with PTI and
> PAE because PAE has the No-eXecute page table bit. They are already
> paying the 2-page penalty. Anyone who cares more about memory
> footprint than security is probably already running a !PAE kernel
> and will not be affected by this.
> 
> --
> 
> There are two 32-bit x86 hardware page table formats. A 2-level one
> with 32-bit pte_t's and a 3-level one with 64-bit pte_t's called PAE.
> But the PAE one is wonky. It effectively loses a bit of addressing
> radix per level since its PTEs are twice as large. It makes up for
> that by adding the third level, but with only 4 entries in the level.
> 
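
To make the radix arithmetic above concrete, here is a minimal userspace
sketch (not kernel code) of how a 32-bit virtual address splits under each
format, assuming 4 KiB pages. The struct and function names are made up
for illustration; only the bit widths follow from the description above:
4-byte entries give 1024 per page (10 bits per level), 8-byte entries give
512 per page (9 bits per level), which leaves 2 bits for the 4-entry top
level.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: per-level indices of one virtual address. */
struct va_split {
	unsigned top;    /* index into the top-level table */
	unsigned mid;    /* PMD index; unused (0) in the 2-level format */
	unsigned pte;    /* index into the last-level table */
	unsigned offset; /* byte offset within the 4 KiB page */
};

/* 2-level format: 10 + 10 + 12 bits. */
static struct va_split split_2level(uint32_t va)
{
	struct va_split s = {
		.top    = va >> 22,
		.mid    = 0,
		.pte    = (va >> 12) & 0x3ff,
		.offset = va & 0xfff,
	};
	return s;
}

/* PAE format: 2 + 9 + 9 + 12 bits. */
static struct va_split split_pae(uint32_t va)
{
	struct va_split s = {
		.top    = va >> 30,
		.mid    = (va >> 21) & 0x1ff,
		.pte    = (va >> 12) & 0x1ff,
		.offset = va & 0xfff,
	};
	return s;
}

/* The PAE top level is 4 entries of 8 bytes each: 32 bytes, not a page. */
_Static_assert(4 * sizeof(uint64_t) == 32, "PAE top level is 32 bytes");
```

For example, split_pae(0xc0101234) yields top-level index 3, while
split_2level() of the same address yields top-level index 768.
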
> This leads to all kinds of fun because this level only needs 32 bytes
> instead of a whole page. Also, since it has only 4 entries in the top
> level, the hardware just always caches the entire thing aggressively.
> Modifying a PAE pgd_t ends up needing different rules than the
> other x86 paging modes and probably every other architecture too.
> 
> PAE support got even weirder when Xen came along. Xen wants to trap
> into the hypervisor on page table writes and so it protects the guest
> page tables with paging protections. It can't protect a 32 byte
> object with paging protections so it bloats the 32-byte object out
> to a page. Xen also didn't support sharing kernel PMD pages. This
> is mostly moot now because Xen support for running as a 32-bit guest
> was ripped out, but there are still remnants around.
> 
> PAE also interacts with PTI in fun and exciting ways. Since pgd
> updates are so fraught, the PTI PAE implementation just chose to
> avoid pgd updates by preallocating all the PMDs up front since
> there are only 4 instead of 512 or 1024 in the other x86 paging
> modes.
> 
> Make PAE less weird:
>  * Always allocate a page for PAE PGDs. This brings them in line
>    with the other 2 paging modes. It was done for Xen and for
>    PTI already and nobody screamed, so just do it everywhere.
>  * Never share kernel PMD pages. This brings PAE in line with
>    32-bit !PAE and 64-bit.
>  * Always preallocate all PAE PMD pages. This basically makes
>    all PAE kernels behave like PTI ones. It might waste a page
>    of memory, but all 4 pages probably get allocated in the common
>    case anyway.
> 
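
A rough back-of-the-envelope check of the "~2 pages per process" figure,
written as a small userspace sketch rather than anything from the series.
The struct, the helper and both example policies are illustrative
assumptions: the "before" case assumes a 32-byte PGD with three private
PMD pages (a 3G/1G split with the kernel PMD shared), the "after" case a
whole-page PGD with all four PMD pages private.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_BYTES      4096u
#define PAE_PGD_ENTRIES 4u
#define PAE_ENTRY_BYTES 8u

/* Hypothetical description of how one mm's top levels are allocated. */
struct pae_policy {
	int whole_page_pgd;     /* nonzero: PGD takes a full page */
	unsigned private_pmds;  /* PMD pages owned by this mm */
};

/* Bytes spent per process on the PGD plus its private PMD pages. */
static size_t pae_top_level_bytes(struct pae_policy p)
{
	size_t pgd = p.whole_page_pgd ? PAGE_BYTES
				      : PAE_PGD_ENTRIES * PAE_ENTRY_BYTES;

	return pgd + (size_t)p.private_pmds * PAGE_BYTES;
}

/* Assumed non-PTI layout today: 32-byte PGD, kernel PMD shared. */
static const struct pae_policy before = { 0, 3 };
/* Layout the cover letter proposes: page-sized PGD, no sharing. */
static const struct pae_policy after  = { 1, 4 };
```

Under those assumptions the difference is
pae_top_level_bytes(after) - pae_top_level_bytes(before) = 8160 bytes,
i.e. just under two 4 KiB pages: one for the PGD growing to a page, one
for the kernel PMD that is no longer shared.
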
> --
> 
>  include/asm/pgtable-2level_types.h |    2
>  include/asm/pgtable-3level_types.h |    4 -
>  include/asm/pgtable_64_types.h     |    2
>  mm/pat/set_memory.c                |    2
>  mm/pgtable.c                       |  104 +++++--------------------------------
>  5 files changed, 18 insertions(+), 96 deletions(-)

The diffstat alone is pretty nice, so I'd suggest we pursue this series 
even if continued work on 32-bit kernel features is being questioned. 
As long as the code exists and isn't explicitly marked as obsolete, such 
changes are legit.

Thanks,

	Ingo

Thread overview: 16+ messages
2025-01-23 17:24 [RFC][PATCH 0/8] x86/mm: Simplify PAE page table handling Dave Hansen
2025-01-23 17:24 ` [RFC][PATCH 1/8] x86/mm: Always allocate a whole page for PAE PGDs Dave Hansen
2025-01-23 17:24 ` [RFC][PATCH 2/8] x86/mm: Always "broadcast" PMD setting operations Dave Hansen
2025-01-23 17:24 ` [RFC][PATCH 3/8] x86/mm: Always tell core mm to sync kernel mappings Dave Hansen
2025-01-23 17:24 ` [RFC][PATCH 4/8] x86/mm: Simplify PAE PGD sharing macros Dave Hansen
2025-01-23 17:24 ` [RFC][PATCH 5/8] x86/mm: Fix up comments around PMD preallocation Dave Hansen
2025-01-23 17:24 ` [RFC][PATCH 6/8] x86/mm: Preallocate all PAE page tables Dave Hansen
2025-01-23 17:24 ` [RFC][PATCH 7/8] x86/mm: Remove duplicated PMD preallocation macro Dave Hansen
2025-01-23 17:24 ` [RFC][PATCH 8/8] x86/mm: Remove now unused SHARED_KERNEL_PMD Dave Hansen
2025-01-23 21:49 ` [RFC][PATCH 0/8] x86/mm: Simplify PAE page table handling Peter Zijlstra
2025-01-23 23:06   ` Dave Hansen
2025-01-24  7:58     ` Joerg Roedel
2025-01-24 19:12       ` Dave Hansen
2025-01-28  8:13         ` Joerg Roedel
2025-01-24  8:52     ` Peter Zijlstra
2025-02-24 18:55 ` Ingo Molnar [this message]
