From: Ingo Molnar <mingo@kernel.org>
To: Dave Hansen <dave.hansen@linux.intel.com>
Cc: linux-kernel@vger.kernel.org, x86@kernel.org, tglx@linutronix.de,
	bp@alien8.de, joro@8bytes.org, luto@kernel.org,
	peterz@infradead.org, kirill.shutemov@linux.intel.com,
	rick.p.edgecombe@intel.com, jgross@suse.com
Subject: Re: [RFC][PATCH 0/8] x86/mm: Simplify PAE page table handling
Date: Mon, 24 Feb 2025 19:55:01 +0100
Message-ID: <Z7zAhSAzpU_MCGnO@gmail.com>
In-Reply-To: <20250123172428.D6D8C8D9@davehans-spike.ostc.intel.com>


* Dave Hansen <dave.hansen@linux.intel.com> wrote:

> tl;dr: 32-bit PAE page table handling is a bit different when PTI
> is on and off. Making the handling uniform removes a good amount
> of code at the cost of not sharing kernel PMDs. The downside of
> this simplification is bloating non-PTI PAE kernels by ~2 pages
> per process.
> 
> Anyone who cares about security on 32-bit is running with PTI and
> PAE because PAE has the No-eXecute page table bit. They are already
> paying the 2-page penalty. Anyone who cares more about memory
> footprint than security is probably already running a !PAE kernel
> and will not be affected by this.
> 
> --
> 
> There are two 32-bit x86 hardware page table formats. A 2-level one
> with 32-bit pte_t's and a 3-level one with 64-bit pte_t's called PAE.
> But the PAE one is wonky. It effectively loses a bit of addressing
> radix per level since its PTEs are twice as large. It makes up for
> that by adding the third level, but with only 4 entries in the level.
> 
> This leads to all kinds of fun because this level only needs 32 bytes
> instead of a whole page. Also, since it has only 4 entries in the top
> level, the hardware just always caches the entire thing aggressively.
> Modifying a PAE pgd_t ends up needing different rules than the
> other x86 paging modes and probably every other architecture too.
> 
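To make the radix point concrete, here is a minimal sketch of how a 32-bit virtual address is split in the two formats, along with the resulting table sizes. The bit widths (10+10+12 for 2-level, 2+9+9+12 for PAE) are the architectural ones; the struct and function names are made up for this illustration and are not kernel APIs.

```c
#include <stddef.h>
#include <stdint.h>

/* Indices selected by a 32-bit virtual address at each paging level. */
struct va_split {
	uint32_t top;    /* 2-level: page directory index; PAE: top-level (4-entry) index */
	uint32_t mid;    /* PAE only: PMD index; unused (0) for 2-level */
	uint32_t pte;    /* page table index */
	uint32_t offset; /* byte offset within the 4 KiB page */
};

/* Classic 2-level paging: 10 + 10 + 12 bits, 4-byte entries. */
static struct va_split split_2level(uint32_t va)
{
	struct va_split s = {
		.top    = va >> 22,
		.mid    = 0,
		.pte    = (va >> 12) & 0x3ff,
		.offset = va & 0xfff,
	};
	return s;
}

/* PAE: 2 + 9 + 9 + 12 bits, 8-byte entries. */
static struct va_split split_pae(uint32_t va)
{
	struct va_split s = {
		.top    = va >> 30,
		.mid    = (va >> 21) & 0x1ff,
		.pte    = (va >> 12) & 0x1ff,
		.offset = va & 0xfff,
	};
	return s;
}

/* Bytes occupied by one table of `entries` entries, each `entry_size` bytes. */
static size_t table_bytes(size_t entries, size_t entry_size)
{
	return entries * entry_size;
}
```

With these helpers, table_bytes(1024, 4) and table_bytes(512, 8) both come out to a full 4096-byte page, while the 4-entry PAE top level is table_bytes(4, 8), i.e. the 32 bytes mentioned above.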
> PAE support got even weirder when Xen came along. Xen wants to trap
> into the hypervisor on page table writes and so it protects the guest
> page tables with paging protections. It can't protect a 32 byte
> object with paging protections so it bloats the 32-byte object out
> to a page. Xen also didn't support sharing kernel PMD pages. This
> is mostly moot now because support for running as a 32-bit Xen guest
> was ripped out, but there are still remnants around.
> 
> PAE also interacts with PTI in fun and exciting ways. Since pgd
> updates are so fraught, the PTI PAE implementation just chose to
> avoid pgd updates by preallocating all the PMDs up front since
> there are only 4 instead of 512 or 1024 in the other x86 paging
> modes.
> 
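A rough sketch of why preallocating up front is affordable only for PAE: one next-level table per top-level entry costs 4 pages there, versus 512 or 1024 pages in the other modes. The entry counts come from the text above; the 4 KiB page size, the macro, and the function name are assumptions made for this illustration only.

```c
#include <stddef.h>

/* Assumed 4 KiB base page size; illustrative name, not a kernel macro. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Bytes needed to preallocate one next-level table page for every
 * top-level entry: 4 entries for PAE, 512 or 1024 for the other modes.
 */
static size_t prealloc_bytes(size_t top_level_entries)
{
	return top_level_entries * SKETCH_PAGE_SIZE;
}
```

Under these assumptions prealloc_bytes(4) is 16 KiB per page table, whereas 512 and 1024 entries would need 2 MiB and 4 MiB respectively.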
> Make PAE less weird:
>  * Always allocate a page for PAE PGDs. This brings them in line
>    with the other 2 paging modes. It was done for Xen and for
>    PTI already and nobody screamed, so just do it everywhere.
>  * Never share kernel PMD pages. This brings PAE in line with
>    32-bit !PAE and 64-bit.
>  * Always preallocate all PAE PMD pages. This basically makes
>    all PAE kernels behave like PTI ones. It might waste a page
>    of memory, but all 4 pages probably get allocated in the common
>    case anyway.
> 
> --
> 
>  include/asm/pgtable-2level_types.h |    2
>  include/asm/pgtable-3level_types.h |    4 -
>  include/asm/pgtable_64_types.h     |    2
>  mm/pat/set_memory.c                |    2
>  mm/pgtable.c                       |  104 +++++--------------------------------
>  5 files changed, 18 insertions(+), 96 deletions(-)

The diffstat alone is pretty nice, so I'd suggest we pursue this series 
even if continued work on 32-bit kernel features is being questioned. 
As long as the code exists and isn't explicitly marked as obsolete, such 
changes are legit.

Thanks,

	Ingo

Thread overview: 16+ messages
2025-01-23 17:24 [RFC][PATCH 0/8] x86/mm: Simplify PAE page table handling Dave Hansen
2025-01-23 17:24 ` [RFC][PATCH 1/8] x86/mm: Always allocate a whole page for PAE PGDs Dave Hansen
2025-01-23 17:24 ` [RFC][PATCH 2/8] x86/mm: Always "broadcast" PMD setting operations Dave Hansen
2025-01-23 17:24 ` [RFC][PATCH 3/8] x86/mm: Always tell core mm to sync kernel mappings Dave Hansen
2025-01-23 17:24 ` [RFC][PATCH 4/8] x86/mm: Simplify PAE PGD sharing macros Dave Hansen
2025-01-23 17:24 ` [RFC][PATCH 5/8] x86/mm: Fix up comments around PMD preallocation Dave Hansen
2025-01-23 17:24 ` [RFC][PATCH 6/8] x86/mm: Preallocate all PAE page tables Dave Hansen
2025-01-23 17:24 ` [RFC][PATCH 7/8] x86/mm: Remove duplicated PMD preallocation macro Dave Hansen
2025-01-23 17:24 ` [RFC][PATCH 8/8] x86/mm: Remove now unused SHARED_KERNEL_PMD Dave Hansen
2025-01-23 21:49 ` [RFC][PATCH 0/8] x86/mm: Simplify PAE page table handling Peter Zijlstra
2025-01-23 23:06   ` Dave Hansen
2025-01-24  7:58     ` Joerg Roedel
2025-01-24 19:12       ` Dave Hansen
2025-01-28  8:13         ` Joerg Roedel
2025-01-24  8:52     ` Peter Zijlstra
2025-02-24 18:55 ` Ingo Molnar [this message]
