public inbox for linux-arm-kernel@lists.infradead.org
From: Kevin Brodsky <kevin.brodsky@arm.com>
To: linux-hardening@vger.kernel.org
Cc: linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Andy Lutomirski <luto@kernel.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	David Hildenbrand <david@redhat.com>,
	Ira Weiny <ira.weiny@intel.com>, Jann Horn <jannh@google.com>,
	Jeff Xu <jeffxu@chromium.org>, Joey Gouly <joey.gouly@arm.com>,
	Kees Cook <kees@kernel.org>,
	Linus Walleij <linus.walleij@linaro.org>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	Marc Zyngier <maz@kernel.org>, Mark Brown <broonie@kernel.org>,
	Matthew Wilcox <willy@infradead.org>,
	Maxwell Bland <mbland@motorola.com>,
	"Mike Rapoport (IBM)" <rppt@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Pierre Langlois <pierre.langlois@arm.com>,
	Quentin Perret <qperret@google.com>,
	Rick Edgecombe <rick.p.edgecombe@intel.com>,
	Ryan Roberts <ryan.roberts@arm.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Vlastimil Babka <vbabka@suse.cz>, Will Deacon <will@kernel.org>,
	Yang Shi <yang@os.amperecomputing.com>,
	Yeoreum Yun <yeoreum.yun@arm.com>,
	linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org,
	x86@kernel.org
Subject: Re: [PATCH v6 07/30] arm64: Reset POR_EL1 on exception entry
Date: Tue, 5 May 2026 17:42:50 +0200
Message-ID: <7dc9485d-a822-494d-9384-4a973c782c11@arm.com>
In-Reply-To: <20260227175518.3728055-8-kevin.brodsky@arm.com>

On 27/02/2026 18:54, Kevin Brodsky wrote:
> POR_EL1 will be modified, through the kpkeys framework, in order to
> grant temporary RW access to certain keys. If an exception occurs
> in the middle of a "critical section" where POR_EL1 is set to a
> privileged value, it is preferable to reset it to its default value
> upon taking the exception to minimise the amount of code running at
> higher kpkeys level.

It turns out there is a corner case where this doesn't play well with
patch 28 (batching using lazy MMU mode). I got the following splat:

    [   33.603892] Unable to handle kernel write to read-only memory at virtual address ffff00087fbbbd78
    [   33.603969] Mem abort info:
    [   33.604028]   ESR = 0x000000409600004f
    [   33.604058]   EC = 0x25: DABT (current EL), IL = 32 bits
    [   33.604101]   SET = 0, FnV = 0
    [   33.604133]   EA = 0, S1PT
    ** replaying previous printk message **
    [   33.604133]   EA = 0, S1PTW = 0
    [   33.604165]   FSC = 0x0f: level 3 permission fault
    [   33.604200] Data abort info:
    [   33.604222]   ISV = 0, ISS = 0x0000004f, ISS2 = 0x00000040
    [   33.604259]   CM = 0, WnR = 1, TnD = 0, TagAccess = 0
    [   33.604303]   GCS = 0, Overlay = 1, DirtyBit = 0, Xs = 0
    [   33.604345] swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000000eec2a000
    [   33.604397] [ffff00087fbbbd78] pgd=0000000000000000, p4d=18000008fffff403, pud=18000008ffa2d403, pmd=18000008ff82f403, pte=10e80008ffbbb707
    [   33.605031] Internal error: Oops: 000000409600004f [#1]  SMP
    [   33.605596] Modules linked in:
    [   33.605690] CPU: 0 UID: 0 PID: 1 Comm: systemd Tainted: G                N  7.1.0-rc2-00028-g497c3a31207b #371 PREEMPT
    [   33.605864] Tainted: [N]=TEST
    [   33.605933] Hardware name: FVP Base RevC (DT)
    [   33.606012] pstate: 141402009 (nZcv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
    [   33.606140] pc : pageattr_pte_entry+0x18/0x118
    [   33.606272] lr : walk_pte_range_inner+0x1d8/0x480
    [   33.606393] sp : ffff80008005b5a0
    [   33.606467] x29: ffff80008005b5d0 x28: ffffa991675fd6b0 x27: ffff00080e5b0000
    [   33.606662] x26: ffff00080e7af000 x25: 0010000000000001 x24: 0040000000000001
    [   33.606855] x23: 0040000000000041 x22: ffff00080e5b0000 x21: ffff80008005b740
    [   33.607052] x20: ffff00087fbbbd78 x19: ffff00080e5af000 x18: 0000000000000000
    [   33.607245] x17: ffff0008001d2240 x16: 0000000000000004 x15: 0000000000000000
    [   33.607434] x14: ffff00080a80b810 x13: 000000000000b706 x12: 0000000000000001
    [   33.607622] x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000020
    [   33.607809] x8 : ffffa991686a7130 x7 : ffff00880e5af000 x6 : 0000000000000072
    [   33.608000] x5 : 0000000000000003 x4 : ffffa9916625e028 x3 : 0000000000000002
    [   33.608187] x2 : 0000000000000000 x1 : 00e800088e5af707 x0 : ffff00087fbbbd78
    [   33.608378] Call trace:
    [   33.608441]  pageattr_pte_entry+0x18/0x118 (P)
    [   33.608587]  walk_pgd_range+0x648/0x94c
    [   33.608716]  walk_kernel_page_table_range_lockless+0x5c/0x98
    [   33.608864]  update_range_prot+0x8c/0x1a4
    [   33.609007]  set_memory_pkey+0x48/0x80
    [   33.609149]  kpkeys_pgtable_free+0x40/0x9c
    [   33.609305]  pgd_free+0xd8/0x120
    [   33.609429]  __mmdrop+0x54/0x1d0
    [   33.609552]  finish_task_switch.isra.0+0x234/0x2c4
    [   33.609714]  __schedule+0x3ac/0xf00
    [   33.609860]  preempt_schedule_irq+0x3c/0x7c
    [   33.610013]  raw_irqentry_exit_cond_resched+0x2c/0x54
    [   33.610154]  arm64_exit_to_kernel_mode+0x40/0x5c
    [   33.610290]  el1_interrupt+0x48/0x60
    [   33.610416]  el1h_64_irq_handler+0x18/0x24
    [   33.610553]  el1h_64_irq+0x8c/0x90
    [   33.610672]  __vunmap_range_noflush+0x310/0x540 (P)
    [   33.610829]  remove_vm_area+0x50/0xa4
    [   33.610977]  vfree+0x38/0x274
    [   33.611118]  n_tty_close+0x40/0xa8
    [   33.611234]  tty_ldisc_close+0x4c/0xb0
    [   33.611360]  tty_ldisc_kill+0x30/0x64
    [   33.611485]  tty_ldisc_release+0xd0/0x1b0
    [   33.611615]  tty_release_struct+0x20/0x88
    [   33.611766]  tty_release+0x384/0x480
    [   33.611912]  __fput+0xd0/0x300
    [   33.612041]  fput_close_sync+0x38/0x108
    [   33.612180]  __arm64_sys_close+0x38/0x7c
    [   33.612308]  invoke_syscall.constprop.0+0x40/0x108
    [   33.612447]  el0_svc_common.constprop.0+0x38/0xd8
    [   33.612589]  do_el0_svc+0x1c/0x28
    [   33.612720]  el0_svc+0x38/0x148
    [   33.612846]  el0t_64_sync_handler+0xa0/0xe4
    [   33.612984]  el0t_64_sync+0x198/0x19c
    [   33.613137] Code: a9400c42 8a230021 aa020021 1400000a (f9000001)
    [   33.613230] ---[ end trace 0000000000000000 ]---
    [   33.974524] Kernel panic - not syncing: Oops: Fatal exception

What happened is that a thread entered lazy MMU mode in
vunmap_pte_range() (inlined) and then an IRQ fired. On the exit path of
the IRQ, another thread got scheduled. Later, the original thread was
scheduled again, and it so happened that finish_task_switch() had an
mm to drop (mmdrop_lazy_tlb_sched(mm)) and held the last reference to
that mm. It then proceeded to free the PGD and eventually wrote to a
linear map page table to reset the pkey.

Because this patch resets POR_EL1 on exception entry, anything running
before exception return uses the default POR_EL1 value, which does not
grant write access to page tables. This is indeed the intention, but as
this crash shows, it comes with an implicit assumption that the
context-switching machinery does not itself write to page tables (at
least not on the irqexit path).
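
As a rough illustration, the interaction can be modelled in user space.
This is a hypothetical sketch, not code from the series: every name
below is made up, and it assumes that a guarded page table write skips
its own POR_EL1 switch while lazy MMU mode is active, relying on the
level raised when the batch started.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
enum model_por { MODEL_POR_DEFAULT, MODEL_POR_PGTABLES };

struct model_cpu {
	enum model_por por;	/* stands in for POR_EL1 */
	bool lazy_mmu_active;	/* stands in for the task's lazy MMU state */
};

/* Assumed batching behaviour: raise the level once for the whole batch. */
static void model_lazy_mmu_enter(struct model_cpu *c)
{
	c->lazy_mmu_active = true;
	c->por = MODEL_POR_PGTABLES;
}

/* Exception entry as described above: save the value, reset to default. */
static enum model_por model_exception_entry(struct model_cpu *c)
{
	enum model_por saved = c->por;

	c->por = MODEL_POR_DEFAULT;
	return saved;
}

static void model_exception_return(struct model_cpu *c, enum model_por saved)
{
	c->por = saved;
}

/* Returns true if the write would succeed, false if it would fault. */
static bool model_pgtable_write(struct model_cpu *c)
{
	enum model_por prev = c->por;
	bool ok;

	if (!c->lazy_mmu_active)
		c->por = MODEL_POR_PGTABLES;	/* unbatched: raise around the write */
	ok = (c->por == MODEL_POR_PGTABLES);
	if (!c->lazy_mmu_active)
		c->por = prev;
	return ok;
}

/* Replays: lazy MMU entered, IRQ taken, page table write before return. */
static bool model_write_on_irq_exit_succeeds(bool reset_on_entry)
{
	struct model_cpu c = { .por = MODEL_POR_DEFAULT, .lazy_mmu_active = false };
	enum model_por saved;
	bool ok;

	model_lazy_mmu_enter(&c);
	assert(c.por == MODEL_POR_PGTABLES);

	saved = c.por;
	if (reset_on_entry)
		saved = model_exception_entry(&c);
	ok = model_pgtable_write(&c);	/* e.g. the pkey reset under pgd_free() */
	model_exception_return(&c, saved);
	return ok;
}
```

In this model, model_write_on_irq_exit_succeeds(true) reports a fault
because the lazy MMU flag is still set while the modelled register is
back at its default, whereas model_write_on_irq_exit_succeeds(false)
succeeds.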

This patch isn't functionally required for page table protection so it
will be dropped in RFC v7. Maybe lazy MMU mode could be paused for the
duration of finish_task_switch() instead, but I'm not sure whether this
is a generic enough solution.
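
The pausing idea can be sketched in the same spirit. Again this is a
hypothetical user-space model with made-up names, assuming that a
paused lazy MMU mode makes each guarded write raise and restore the
level by itself; it says nothing about how a real implementation would
hook into finish_task_switch().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
enum pmodel_por { PMODEL_POR_DEFAULT, PMODEL_POR_PGTABLES };

struct pmodel_task {
	enum pmodel_por por;	/* stands in for POR_EL1 */
	bool lazy_mmu_active;
	bool lazy_mmu_paused;
};

/* Writes are only treated as batched while lazy mode is active and not paused. */
static bool pmodel_batching(const struct pmodel_task *t)
{
	return t->lazy_mmu_active && !t->lazy_mmu_paused;
}

/* Returns true if the write would succeed, false if it would fault. */
static bool pmodel_guarded_write(struct pmodel_task *t)
{
	enum pmodel_por prev = t->por;
	bool batching = pmodel_batching(t);
	bool ok;

	if (!batching)
		t->por = PMODEL_POR_PGTABLES;	/* raise around this write only */
	ok = (t->por == PMODEL_POR_PGTABLES);
	if (!batching)
		t->por = prev;
	return ok;
}

/*
 * Models the tail of a context switch that ends up writing to a page
 * table while the incoming task is still marked as being in lazy MMU mode.
 */
static bool pmodel_finish_task_switch(struct pmodel_task *t, bool pause)
{
	bool was_paused = t->lazy_mmu_paused;
	bool ok;

	assert(t->lazy_mmu_active);	/* the scenario only applies in lazy mode */
	if (pause)
		t->lazy_mmu_paused = true;
	ok = pmodel_guarded_write(t);
	t->lazy_mmu_paused = was_paused;
	return ok;
}
```

With pause set, the write in the model succeeds even though the
modelled register starts at its default value, and both the register
and the paused flag are left as they were found.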

- Kevin

