* [RFC PATCH 00/16] pkeys-based page table hardening
@ 2024-12-06 10:10 Kevin Brodsky
From: Kevin Brodsky @ 2024-12-06 10:10 UTC (permalink / raw)
To: linux-hardening
Cc: linux-kernel, Kevin Brodsky, aruna.ramakrishna, broonie,
catalin.marinas, dave.hansen, jannh, jeffxu, joey.gouly, kees,
maz, pierre.langlois, qperret, ryan.roberts, will,
linux-arm-kernel, x86
This is a proposal to leverage protection keys (pkeys) to harden
critical kernel data, by making it mostly read-only. The series includes
a simple framework called "kpkeys" to manipulate pkeys for in-kernel use,
as well as a page table hardening feature based on that framework
(kpkeys_hardened_pgtables). Both are implemented on arm64 as a proof of
concept, but they are designed to be compatible with any architecture
implementing pkeys.
The proposed approach is a typical use of pkeys: the data to protect is
mapped with a given pkey P, and the pkey register is initially configured
to grant read-only access to P. Where the protected data needs to be
written to, the pkey register is temporarily switched to grant write
access to P on the current CPU.
The key fact this approach relies on is that the target data is
only written to via a limited and well-defined API. This makes it
possible to explicitly switch the pkey register where needed, without
introducing excessively invasive changes, and only for a small amount of
trusted code.
Page tables were chosen as they are a popular (and critical) target for
attacks, but there are of course many others - this is only a starting
point (see section "Further use-cases"). It has become increasingly
common for accesses to such target data to be mediated by a hypervisor
in vendor kernels; the hope is that kpkeys can provide much of that
protection in a simpler manner. No benchmarking has been performed at
this stage, but the runtime overhead should also be lower (though likely
not negligible).
# kpkeys
The use of pkeys involves two separate mechanisms: assigning a pkey to
pages, and defining the pkeys -> permissions mapping via the pkey
register. This is implemented through the following interface:
- Pages in the linear mapping are assigned a pkey using set_memory_pkey().
This is sufficient for this series, but of course higher-level
interfaces can be introduced later to ask allocators to return pages
marked with a given pkey. It should also be possible to extend this to
vmalloc() if needed.
- The pkey register is configured based on a *kpkeys level*. kpkeys
levels are simple integers that correspond to a given configuration,
for instance:
KPKEYS_LVL_DEFAULT:
  RW access to KPKEYS_PKEY_DEFAULT
  RO access to any other KPKEYS_PKEY_*

KPKEYS_LVL_<FEAT>:
  RW access to KPKEYS_PKEY_DEFAULT
  RW access to KPKEYS_PKEY_<FEAT>
  RO access to any other KPKEYS_PKEY_*
Only pkeys that are managed by the kpkeys framework are impacted;
permissions for other pkeys are left unchanged (this allows for other
schemes using pkeys to be used in parallel, and arch-specific use of
certain pkeys).
The kpkeys level is changed by calling kpkeys_set_level(), which sets
the pkey register accordingly and returns its original value. A
subsequent call to kpkeys_restore_pkey_reg() restores the original
kpkeys level. The numeric values of KPKEYS_LVL_* (kpkeys levels) are
purely symbolic and thus generic; each architecture is however free to
define KPKEYS_PKEY_* (pkey values).
# kpkeys_hardened_pgtables
The kpkeys_hardened_pgtables feature uses the interface above to make
the (kernel and user) page tables read-only by default, enabling write
access only in helpers such as set_pte(). One complication is that those
helpers as well as page table allocators are used very early, before
kpkeys become available. Enabling kpkeys_hardened_pgtables, if and when
kpkeys become available, is therefore done as follows:
1. A static key is turned on. This enables a transition to
KPKEYS_LVL_PGTABLES in all helpers writing to page tables, and also
impacts page table allocators (see step 3).
2. All pages holding kernel page tables are set to KPKEYS_PKEY_PGTABLES.
This ensures they can only be written when running at
KPKEYS_LVL_PGTABLES.
3. Page table allocators set the returned pages to KPKEYS_PKEY_PGTABLES
(and the pkey is reset upon freeing). This ensures that all page
tables are mapped with that privileged pkey.
# Threat model
The proposed scheme aims to mitigate data-only attacks (e.g.
use-after-free/cross-cache attacks). In other words, it is assumed that
control flow is not corrupted, and that the attacker does not achieve
arbitrary code execution. Nothing prevents the pkey register from being
set to its most permissive state - the assumption is that the register
is only modified on legitimate code paths.
A few related notes:
- Functions that set the pkey register are all implemented inline.
Besides performance considerations, this is meant to avoid creating
a function that can be used as a straightforward gadget to set the
pkey register to an arbitrary value.
- kpkeys_set_level() only accepts a compile-time constant as argument,
as a variable could be manipulated by an attacker. This could be
relaxed but it seems unlikely that a variable kpkeys level would be
needed in practice.
# Further use-cases
It should be possible to harden various targets using kpkeys, including:
- struct cred (enforcing a "mostly read-only" state once committed)
- fixmap (occasionally used even after early boot, e.g.
set_swapper_pgd() in arch/arm64/mm/mmu.c)
- SELinux state (e.g. struct selinux_state::initialized)
... and many others.
kpkeys could also be used to strengthen the confidentiality of secret
data by making it completely inaccessible by default, and granting
read-only or read-write access as needed. This requires such data to be
rarely accessed (or via a limited interface only). One example on arm64
is the pointer authentication keys in thread_struct, whose leakage to
userspace would lead to pointer authentication being easily defeated.
# This series
The series is composed of two parts:
- The kpkeys framework (patch 1-7). The main API is introduced in
<linux/kpkeys.h>, and it is implemented on arm64 using the POE
(Permission Overlay Extension) feature.
- The kpkeys_hardened_pgtables feature (patch 8-16). <linux/kpkeys.h> is
extended with an API to set page table pages to a given pkey and a
guard object to switch kpkeys level accordingly, both gated on a
static key. This is then used in generic and arm64 pgtable handling
code as needed. Finally a simple KUnit-based test suite is added to
demonstrate the page table protection.
The arm64 implementation should be considered a proof of concept only.
The enablement of POE for in-kernel use is incomplete; in particular
POR_EL1 (pkey register) should be reset on exception entry and restored
on exception return.
# Performance
No particular efforts were made to optimise the use of kpkeys at this
stage (and no benchmarking was performed either). There are two obvious
low-hanging fruits in the kpkeys_hardened_pgtables feature:
- Always switching kpkeys level in leaf helpers such as set_pte() can be
very inefficient if many page table entries are updated in a row. Some
sort of batching may be desirable.
- On arm64 specifically, the page table helpers typically perform an
expensive ISB (Instruction Synchronisation Barrier) after writing to
page tables. Since most of the cost of switching the arm64 pkey
register (POR_EL1) comes from the following ISB, the overhead incurred
by kpkeys_restore_pkey_reg() would be significantly reduced by merging
its ISB with the pgtable helper's. That would however require more
invasive changes, beyond simply adding a guard object.
# Open questions
A few aspects in this RFC that are debatable and/or worth discussing:
- There is currently no restriction on how kpkeys levels map to pkeys
permissions. A typical approach is to allocate one pkey per level and
make it writable at that level only. As the number of levels
increases, we may however run out of pkeys, especially on arm64 (just
8 pkeys with POE). Depending on the use-cases, it may be acceptable to
use the same pkey for the data associated with multiple levels.
Another potential concern is that a given piece of code may require
write access to multiple privileged pkeys. This could be addressed by
introducing a notion of hierarchy in trust levels, where Tn is able to
write to memory owned by Tm if n >= m, for instance.
- kpkeys_set_level() and kpkeys_restore_pkey_reg() are not symmetric:
the former takes a kpkeys level and returns a pkey register value, to
be consumed by the latter. It would be more intuitive to manipulate
kpkeys levels only. However this assumes that there is a 1:1 mapping
between kpkeys levels and pkey register values, while in principle
the mapping is 1:n (certain pkeys may be used outside the kpkeys
framework).
- An architecture that supports kpkeys is expected to select
CONFIG_ARCH_HAS_KPKEYS and always enable them if available - there is
no CONFIG_KPKEYS to control this behaviour. Since this creates no
significant overhead (at least on arm64), it seemed better to keep it
simple. Each hardening feature does have its own option and arch
opt-in if needed (CONFIG_KPKEYS_HARDENED_PGTABLES,
CONFIG_ARCH_HAS_KPKEYS_HARDENED_PGTABLES).
Any comment or feedback will be highly appreciated, be it on the
high-level approach or implementation choices!
- Kevin
---
Cc: aruna.ramakrishna@oracle.com
Cc: broonie@kernel.org
Cc: catalin.marinas@arm.com
Cc: dave.hansen@linux.intel.com
Cc: jannh@google.com
Cc: jeffxu@chromium.org
Cc: joey.gouly@arm.com
Cc: kees@kernel.org
Cc: maz@kernel.org
Cc: pierre.langlois@arm.com
Cc: qperret@google.com
Cc: ryan.roberts@arm.com
Cc: will@kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: x86@kernel.org
---
Kevin Brodsky (16):
mm: Introduce kpkeys
set_memory: Introduce set_memory_pkey() stub
arm64: mm: Enable overlays for all EL1 indirect permissions
arm64: Introduce por_set_pkey_perms() helper
arm64: Implement asm/kpkeys.h using POE
arm64: set_memory: Implement set_memory_pkey()
arm64: Enable kpkeys
mm: Introduce kernel_pgtables_set_pkey()
mm: Introduce kpkeys_hardened_pgtables
mm: Map page tables with privileged pkey
arm64: kpkeys: Support KPKEYS_LVL_PGTABLES
arm64: mm: Map p4d/pgd with privileged pkey
arm64: mm: Reset pkey in __tlb_remove_table()
arm64: mm: Guard page table writes with kpkeys
arm64: Enable kpkeys_hardened_pgtables support
mm: Add basic tests for kpkeys_hardened_pgtables
arch/arm64/Kconfig | 2 +
arch/arm64/include/asm/kpkeys.h | 45 +++++++++
arch/arm64/include/asm/pgalloc.h | 21 +++-
arch/arm64/include/asm/pgtable-prot.h | 16 ++--
arch/arm64/include/asm/pgtable.h | 19 +++-
arch/arm64/include/asm/por.h | 9 ++
arch/arm64/include/asm/set_memory.h | 4 +
arch/arm64/include/asm/tlb.h | 6 +-
arch/arm64/kernel/cpufeature.c | 5 +-
arch/arm64/kernel/smp.c | 2 +
arch/arm64/mm/fault.c | 2 +
arch/arm64/mm/mmu.c | 28 ++----
arch/arm64/mm/pageattr.c | 21 ++++
arch/arm64/mm/pgd.c | 30 +++++-
include/asm-generic/kpkeys.h | 21 ++++
include/linux/kpkeys.h | 132 ++++++++++++++++++++++++++
include/linux/mm.h | 22 ++++-
include/linux/set_memory.h | 7 ++
mm/Kconfig | 5 +
mm/Makefile | 2 +
mm/kpkeys_hardened_pgtables.c | 17 ++++
mm/kpkeys_hardened_pgtables_test.c | 71 ++++++++++++++
mm/memory.c | 130 +++++++++++++++++++++++++
security/Kconfig.hardening | 24 +++++
24 files changed, 604 insertions(+), 37 deletions(-)
create mode 100644 arch/arm64/include/asm/kpkeys.h
create mode 100644 include/asm-generic/kpkeys.h
create mode 100644 include/linux/kpkeys.h
create mode 100644 mm/kpkeys_hardened_pgtables.c
create mode 100644 mm/kpkeys_hardened_pgtables_test.c
--
2.47.0
* [RFC PATCH 01/16] mm: Introduce kpkeys
From: Kevin Brodsky @ 2024-12-06 10:10 UTC (permalink / raw)
To: linux-hardening
Cc: linux-kernel, Kevin Brodsky, aruna.ramakrishna, broonie,
catalin.marinas, dave.hansen, jannh, jeffxu, joey.gouly, kees,
maz, pierre.langlois, qperret, ryan.roberts, will,
linux-arm-kernel, x86
kpkeys is a simple framework to enable the use of protection keys
(pkeys) to harden the kernel itself. This patch introduces the basic
API in <linux/kpkeys.h>: a couple of functions to set and restore
the pkey register and a macro to define guard objects.
kpkeys introduces a new concept on top of pkeys: the kpkeys level.
Each level is associated with a set of permissions for the pkeys
managed by the kpkeys framework. kpkeys_set_level(lvl) sets those
permissions according to lvl, and returns the original pkey
register, to be later restored by kpkeys_restore_pkey_reg(). To
start with, only KPKEYS_LVL_DEFAULT is available, which is meant
to grant RW access to KPKEYS_PKEY_DEFAULT (i.e. all memory since
this is the only available pkey for now).
Because each architecture implementing pkeys uses a different
representation for the pkey register, and may reserve certain pkeys
for specific uses, support for kpkeys must be explicitly indicated
by selecting ARCH_HAS_KPKEYS and defining the following functions in
<asm/kpkeys.h>, in addition to the macros provided in
<asm-generic/kpkeys.h>:
- arch_kpkeys_set_level()
- arch_kpkeys_restore_pkey_reg()
- arch_kpkeys_enabled()
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
---
include/asm-generic/kpkeys.h | 9 +++++
include/linux/kpkeys.h | 67 ++++++++++++++++++++++++++++++++++++
mm/Kconfig | 2 ++
3 files changed, 78 insertions(+)
create mode 100644 include/asm-generic/kpkeys.h
create mode 100644 include/linux/kpkeys.h
diff --git a/include/asm-generic/kpkeys.h b/include/asm-generic/kpkeys.h
new file mode 100644
index 000000000000..3404ce249757
--- /dev/null
+++ b/include/asm-generic/kpkeys.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef __ASM_GENERIC_KPKEYS_H
+#define __ASM_GENERIC_KPKEYS_H
+
+#ifndef KPKEYS_PKEY_DEFAULT
+#define KPKEYS_PKEY_DEFAULT 0
+#endif
+
+#endif /* __ASM_GENERIC_KPKEYS_H */
diff --git a/include/linux/kpkeys.h b/include/linux/kpkeys.h
new file mode 100644
index 000000000000..bcc063425926
--- /dev/null
+++ b/include/linux/kpkeys.h
@@ -0,0 +1,67 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef _LINUX_KPKEYS_H
+#define _LINUX_KPKEYS_H
+
+#include <linux/bug.h>
+#include <linux/cleanup.h>
+#include <linux/set_memory.h>
+
+#define KPKEYS_LVL_DEFAULT 0
+
+#define KPKEYS_LVL_MIN KPKEYS_LVL_DEFAULT
+#define KPKEYS_LVL_MAX KPKEYS_LVL_DEFAULT
+
+#define KPKEYS_GUARD(_name, set_level, restore_pkey_reg) \
+ __DEFINE_CLASS_IS_CONDITIONAL(_name, false); \
+ DEFINE_CLASS(_name, u64, \
+ restore_pkey_reg, set_level, void); \
+ static inline void *class_##_name##_lock_ptr(u64 *_T) \
+ { return _T; }
+
+#ifdef CONFIG_ARCH_HAS_KPKEYS
+
+#include <asm/kpkeys.h>
+
+/**
+ * kpkeys_set_level() - switch kpkeys level
+ * @level: the level to switch to
+ *
+ * Switches the kpkeys level to the specified value. @level must be a
+ * compile-time constant. The arch-specific pkey register will be updated
+ * accordingly, and the original value returned.
+ *
+ * Return: the original pkey register value.
+ */
+static inline u64 kpkeys_set_level(int level)
+{
+ BUILD_BUG_ON_MSG(!__builtin_constant_p(level),
+ "kpkeys_set_level() only takes constant levels");
+ BUILD_BUG_ON_MSG(level < KPKEYS_LVL_MIN || level > KPKEYS_LVL_MAX,
+ "Invalid level passed to kpkeys_set_level()");
+
+ return arch_kpkeys_set_level(level);
+}
+
+/**
+ * kpkeys_restore_pkey_reg() - restores a pkey register value
+ * @pkey_reg: the pkey register value to restore
+ *
+ * This function is meant to be passed the value returned by kpkeys_set_level(),
+ * in order to restore the pkey register to its original value (thus restoring
+ * the original kpkeys level).
+ */
+static inline void kpkeys_restore_pkey_reg(u64 pkey_reg)
+{
+ arch_kpkeys_restore_pkey_reg(pkey_reg);
+}
+
+#else /* CONFIG_ARCH_HAS_KPKEYS */
+
+static inline bool arch_kpkeys_enabled(void)
+{
+ return false;
+}
+
+#endif /* CONFIG_ARCH_HAS_KPKEYS */
+
+#endif /* _LINUX_KPKEYS_H */
diff --git a/mm/Kconfig b/mm/Kconfig
index 84000b016808..f51dffca9d4e 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -1104,6 +1104,8 @@ config ARCH_USES_HIGH_VMA_FLAGS
bool
config ARCH_HAS_PKEYS
bool
+config ARCH_HAS_KPKEYS
+ bool
config ARCH_USES_PG_ARCH_2
bool
--
2.47.0
* [RFC PATCH 02/16] set_memory: Introduce set_memory_pkey() stub
From: Kevin Brodsky @ 2024-12-06 10:10 UTC (permalink / raw)
To: linux-hardening
Cc: linux-kernel, Kevin Brodsky, aruna.ramakrishna, broonie,
catalin.marinas, dave.hansen, jannh, jeffxu, joey.gouly, kees,
maz, pierre.langlois, qperret, ryan.roberts, will,
linux-arm-kernel, x86
Introduce a new function, set_memory_pkey(), which sets the
protection key (pkey) of pages in the specified linear mapping
range. Architectures implementing kernel pkeys (kpkeys) must
provide a suitable implementation; an empty stub is added as
fallback.
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
---
include/linux/set_memory.h | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/include/linux/set_memory.h b/include/linux/set_memory.h
index 3030d9245f5a..7b3a8bfde3c6 100644
--- a/include/linux/set_memory.h
+++ b/include/linux/set_memory.h
@@ -84,4 +84,11 @@ static inline int set_memory_decrypted(unsigned long addr, int numpages)
}
#endif /* CONFIG_ARCH_HAS_MEM_ENCRYPT */
+#ifndef CONFIG_ARCH_HAS_KPKEYS
+static inline int set_memory_pkey(unsigned long addr, int numpages, int pkey)
+{
+ return 0;
+}
+#endif
+
#endif /* _LINUX_SET_MEMORY_H_ */
--
2.47.0
* [RFC PATCH 03/16] arm64: mm: Enable overlays for all EL1 indirect permissions
From: Kevin Brodsky @ 2024-12-06 10:10 UTC (permalink / raw)
To: linux-hardening
Cc: linux-kernel, Kevin Brodsky, aruna.ramakrishna, broonie,
catalin.marinas, dave.hansen, jannh, jeffxu, joey.gouly, kees,
maz, pierre.langlois, qperret, ryan.roberts, will,
linux-arm-kernel, x86
In preparation for using POE inside the kernel, enable "Overlay
applied" for all stage 1 base permissions in PIR_EL1. This ensures
that the permissions set in POR_EL1 affect all kernel mappings.
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
---
arch/arm64/include/asm/pgtable-prot.h | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/arch/arm64/include/asm/pgtable-prot.h b/arch/arm64/include/asm/pgtable-prot.h
index 9f9cf13bbd95..a1c4f3837ea9 100644
--- a/arch/arm64/include/asm/pgtable-prot.h
+++ b/arch/arm64/include/asm/pgtable-prot.h
@@ -174,13 +174,13 @@ static inline bool __pure lpa2_is_enabled(void)
PIRx_ELx_PERM(pte_pi_index(_PAGE_GCS), PIE_NONE_O) | \
PIRx_ELx_PERM(pte_pi_index(_PAGE_GCS_RO), PIE_NONE_O) | \
PIRx_ELx_PERM(pte_pi_index(_PAGE_EXECONLY), PIE_NONE_O) | \
- PIRx_ELx_PERM(pte_pi_index(_PAGE_READONLY_EXEC), PIE_R) | \
- PIRx_ELx_PERM(pte_pi_index(_PAGE_SHARED_EXEC), PIE_RW) | \
- PIRx_ELx_PERM(pte_pi_index(_PAGE_READONLY), PIE_R) | \
- PIRx_ELx_PERM(pte_pi_index(_PAGE_SHARED), PIE_RW) | \
- PIRx_ELx_PERM(pte_pi_index(_PAGE_KERNEL_ROX), PIE_RX) | \
- PIRx_ELx_PERM(pte_pi_index(_PAGE_KERNEL_EXEC), PIE_RWX) | \
- PIRx_ELx_PERM(pte_pi_index(_PAGE_KERNEL_RO), PIE_R) | \
- PIRx_ELx_PERM(pte_pi_index(_PAGE_KERNEL), PIE_RW))
+ PIRx_ELx_PERM(pte_pi_index(_PAGE_READONLY_EXEC), PIE_R_O) | \
+ PIRx_ELx_PERM(pte_pi_index(_PAGE_SHARED_EXEC), PIE_RW_O) | \
+ PIRx_ELx_PERM(pte_pi_index(_PAGE_READONLY), PIE_R_O) | \
+ PIRx_ELx_PERM(pte_pi_index(_PAGE_SHARED), PIE_RW_O) | \
+ PIRx_ELx_PERM(pte_pi_index(_PAGE_KERNEL_ROX), PIE_RX_O) | \
+ PIRx_ELx_PERM(pte_pi_index(_PAGE_KERNEL_EXEC), PIE_RWX_O) | \
+ PIRx_ELx_PERM(pte_pi_index(_PAGE_KERNEL_RO), PIE_R_O) | \
+ PIRx_ELx_PERM(pte_pi_index(_PAGE_KERNEL), PIE_RW_O))
#endif /* __ASM_PGTABLE_PROT_H */
--
2.47.0
* [RFC PATCH 04/16] arm64: Introduce por_set_pkey_perms() helper
From: Kevin Brodsky @ 2024-12-06 10:10 UTC (permalink / raw)
To: linux-hardening
Cc: linux-kernel, Kevin Brodsky, aruna.ramakrishna, broonie,
catalin.marinas, dave.hansen, jannh, jeffxu, joey.gouly, kees,
maz, pierre.langlois, qperret, ryan.roberts, will,
linux-arm-kernel, x86
Introduce a helper that sets the permissions of a given pkey
(POIndex) in the POR_ELx format, and make use of it in
arch_set_user_pkey_access().
Also ensure that <asm/sysreg.h> is included in asm/por.h to provide
the POE_* definitions.
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
---
arch/arm64/include/asm/por.h | 9 +++++++++
arch/arm64/mm/mmu.c | 28 ++++++++++------------------
2 files changed, 19 insertions(+), 18 deletions(-)
diff --git a/arch/arm64/include/asm/por.h b/arch/arm64/include/asm/por.h
index e06e9f473675..7f0d73980cce 100644
--- a/arch/arm64/include/asm/por.h
+++ b/arch/arm64/include/asm/por.h
@@ -6,6 +6,8 @@
#ifndef _ASM_ARM64_POR_H
#define _ASM_ARM64_POR_H
+#include <asm/sysreg.h>
+
#define POR_BITS_PER_PKEY 4
#define POR_ELx_IDX(por_elx, idx) (((por_elx) >> ((idx) * POR_BITS_PER_PKEY)) & 0xf)
@@ -30,4 +32,11 @@ static inline bool por_elx_allows_exec(u64 por, u8 pkey)
return perm & POE_X;
}
+static inline u64 por_set_pkey_perms(u64 por, u8 pkey, u64 perms)
+{
+ u64 shift = pkey * POR_BITS_PER_PKEY;
+
+ return (por & ~(POE_MASK << shift)) | (perms << shift);
+}
+
#endif /* _ASM_ARM64_POR_H */
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index e2739b69e11b..20e0390ee382 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1554,9 +1554,8 @@ void __cpu_replace_ttbr1(pgd_t *pgdp, bool cnp)
#ifdef CONFIG_ARCH_HAS_PKEYS
int arch_set_user_pkey_access(struct task_struct *tsk, int pkey, unsigned long init_val)
{
- u64 new_por = POE_RXW;
- u64 old_por;
- u64 pkey_shift;
+ u64 new_perms;
+ u64 por;
if (!system_supports_poe())
return -ENOSPC;
@@ -1570,26 +1569,19 @@ int arch_set_user_pkey_access(struct task_struct *tsk, int pkey, unsigned long i
return -EINVAL;
/* Set the bits we need in POR: */
- new_por = POE_RXW;
+ new_perms = POE_RXW;
if (init_val & PKEY_DISABLE_WRITE)
- new_por &= ~POE_W;
+ new_perms &= ~POE_W;
if (init_val & PKEY_DISABLE_ACCESS)
- new_por &= ~POE_RW;
+ new_perms &= ~POE_RW;
if (init_val & PKEY_DISABLE_READ)
- new_por &= ~POE_R;
+ new_perms &= ~POE_R;
if (init_val & PKEY_DISABLE_EXECUTE)
- new_por &= ~POE_X;
+ new_perms &= ~POE_X;
- /* Shift the bits in to the correct place in POR for pkey: */
- pkey_shift = pkey * POR_BITS_PER_PKEY;
- new_por <<= pkey_shift;
-
- /* Get old POR and mask off any old bits in place: */
- old_por = read_sysreg_s(SYS_POR_EL0);
- old_por &= ~(POE_MASK << pkey_shift);
-
- /* Write old part along with new part: */
- write_sysreg_s(old_por | new_por, SYS_POR_EL0);
+ por = read_sysreg_s(SYS_POR_EL0);
+ por = por_set_pkey_perms(por, pkey, new_perms);
+ write_sysreg_s(por, SYS_POR_EL0);
return 0;
}
--
2.47.0
* [RFC PATCH 05/16] arm64: Implement asm/kpkeys.h using POE
From: Kevin Brodsky @ 2024-12-06 10:10 UTC (permalink / raw)
To: linux-hardening
Cc: linux-kernel, Kevin Brodsky, aruna.ramakrishna, broonie,
catalin.marinas, dave.hansen, jannh, jeffxu, joey.gouly, kees,
maz, pierre.langlois, qperret, ryan.roberts, will,
linux-arm-kernel, x86
Implement the kpkeys interface if CONFIG_ARM64_POE is enabled.
The permissions for KPKEYS_PKEY_DEFAULT (pkey 0) are set to RWX as
this pkey is also used for code mappings.
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
---
arch/arm64/include/asm/kpkeys.h | 43 +++++++++++++++++++++++++++++++++
1 file changed, 43 insertions(+)
create mode 100644 arch/arm64/include/asm/kpkeys.h
diff --git a/arch/arm64/include/asm/kpkeys.h b/arch/arm64/include/asm/kpkeys.h
new file mode 100644
index 000000000000..e17f6df41873
--- /dev/null
+++ b/arch/arm64/include/asm/kpkeys.h
@@ -0,0 +1,43 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef __ASM_KPKEYS_H
+#define __ASM_KPKEYS_H
+
+#include <asm/barrier.h>
+#include <asm/cpufeature.h>
+#include <asm/por.h>
+
+#include <asm-generic/kpkeys.h>
+
+static inline bool arch_kpkeys_enabled(void)
+{
+ return system_supports_poe();
+}
+
+#ifdef CONFIG_ARM64_POE
+
+static inline u64 por_set_kpkeys_level(u64 por, int level)
+{
+ por = por_set_pkey_perms(por, KPKEYS_PKEY_DEFAULT, POE_RXW);
+
+ return por;
+}
+
+static inline int arch_kpkeys_set_level(int level)
+{
+ u64 prev_por = read_sysreg_s(SYS_POR_EL1);
+
+ write_sysreg_s(por_set_kpkeys_level(prev_por, level), SYS_POR_EL1);
+ isb();
+
+ return prev_por;
+}
+
+static inline void arch_kpkeys_restore_pkey_reg(u64 pkey_reg)
+{
+ write_sysreg_s(pkey_reg, SYS_POR_EL1);
+ isb();
+}
+
+#endif /* CONFIG_ARM64_POE */
+
+#endif /* __ASM_KPKEYS_H */
--
2.47.0
* [RFC PATCH 06/16] arm64: set_memory: Implement set_memory_pkey()
From: Kevin Brodsky @ 2024-12-06 10:11 UTC (permalink / raw)
To: linux-hardening
Cc: linux-kernel, Kevin Brodsky, aruna.ramakrishna, broonie,
catalin.marinas, dave.hansen, jannh, jeffxu, joey.gouly, kees,
maz, pierre.langlois, qperret, ryan.roberts, will,
linux-arm-kernel, x86
Implement set_memory_pkey() using POE if supported.
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
---
arch/arm64/include/asm/set_memory.h | 4 ++++
arch/arm64/mm/pageattr.c | 21 +++++++++++++++++++++
2 files changed, 25 insertions(+)
diff --git a/arch/arm64/include/asm/set_memory.h b/arch/arm64/include/asm/set_memory.h
index 90f61b17275e..b6cd6de34abf 100644
--- a/arch/arm64/include/asm/set_memory.h
+++ b/arch/arm64/include/asm/set_memory.h
@@ -19,4 +19,8 @@ bool kernel_page_present(struct page *page);
int set_memory_encrypted(unsigned long addr, int numpages);
int set_memory_decrypted(unsigned long addr, int numpages);
+#ifdef CONFIG_ARCH_HAS_KPKEYS
+int set_memory_pkey(unsigned long addr, int numpages, int pkey);
+#endif
+
#endif /* _ASM_ARM64_SET_MEMORY_H */
diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c
index 39fd1f7ff02a..3b8fec532b18 100644
--- a/arch/arm64/mm/pageattr.c
+++ b/arch/arm64/mm/pageattr.c
@@ -292,6 +292,27 @@ int set_direct_map_valid_noflush(struct page *page, unsigned nr, bool valid)
return set_memory_valid(addr, nr, valid);
}
+#ifdef CONFIG_ARCH_HAS_KPKEYS
+int set_memory_pkey(unsigned long addr, int numpages, int pkey)
+{
+ unsigned long set_prot = 0;
+
+ if (!system_supports_poe())
+ return 0;
+
+ if (!__is_lm_address(addr))
+ return -EINVAL;
+
+ set_prot |= pkey & BIT(0) ? PTE_PO_IDX_0 : 0;
+ set_prot |= pkey & BIT(1) ? PTE_PO_IDX_1 : 0;
+ set_prot |= pkey & BIT(2) ? PTE_PO_IDX_2 : 0;
+
+ return __change_memory_common(addr, PAGE_SIZE * numpages,
+ __pgprot(set_prot),
+ __pgprot(PTE_PO_IDX_MASK));
+}
+#endif
+
#ifdef CONFIG_DEBUG_PAGEALLOC
/*
* This is - apart from the return value - doing the same
--
2.47.0
* [RFC PATCH 07/16] arm64: Enable kpkeys
From: Kevin Brodsky @ 2024-12-06 10:11 UTC (permalink / raw)
To: linux-hardening
Cc: linux-kernel, Kevin Brodsky, aruna.ramakrishna, broonie,
catalin.marinas, dave.hansen, jannh, jeffxu, joey.gouly, kees,
maz, pierre.langlois, qperret, ryan.roberts, will,
linux-arm-kernel, x86
This is the final step to enable kpkeys on arm64. We enable
POE at EL1 by setting TCR2_EL1.POE, and initialise POR_EL1 so that
it enables access to the default pkey/POIndex (default kpkeys
level). An ISB is added so that POE restrictions are enforced
immediately.
Having done this, we can now select ARCH_HAS_KPKEYS if ARM64_POE is
enabled.
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
---
arch/arm64/Kconfig | 1 +
arch/arm64/kernel/cpufeature.c | 5 ++++-
2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 100570a048c5..f35964641c1a 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -2183,6 +2183,7 @@ config ARM64_POE
def_bool y
select ARCH_USES_HIGH_VMA_FLAGS
select ARCH_HAS_PKEYS
+ select ARCH_HAS_KPKEYS
help
The Permission Overlay Extension is used to implement Memory
Protection Keys. Memory Protection Keys provides a mechanism for
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 6ce71f444ed8..3925bf04fb2f 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -75,6 +75,7 @@
#include <linux/cpu.h>
#include <linux/kasan.h>
#include <linux/percpu.h>
+#include <linux/kpkeys.h>
#include <asm/cpu.h>
#include <asm/cpufeature.h>
@@ -2376,8 +2377,10 @@ static void cpu_enable_mops(const struct arm64_cpu_capabilities *__unused)
#ifdef CONFIG_ARM64_POE
static void cpu_enable_poe(const struct arm64_cpu_capabilities *__unused)
{
- sysreg_clear_set(REG_TCR2_EL1, 0, TCR2_EL1x_E0POE);
+ write_sysreg_s(por_set_kpkeys_level(0, KPKEYS_LVL_DEFAULT), SYS_POR_EL1);
+ sysreg_clear_set(REG_TCR2_EL1, 0, TCR2_EL1x_E0POE | TCR2_EL1x_POE);
sysreg_clear_set(CPACR_EL1, 0, CPACR_ELx_E0POE);
+ isb();
}
#endif
--
2.47.0
* [RFC PATCH 08/16] mm: Introduce kernel_pgtables_set_pkey()
From: Kevin Brodsky @ 2024-12-06 10:11 UTC (permalink / raw)
To: linux-hardening
Cc: linux-kernel, Kevin Brodsky, aruna.ramakrishna, broonie,
catalin.marinas, dave.hansen, jannh, jeffxu, joey.gouly, kees,
maz, pierre.langlois, qperret, ryan.roberts, will,
linux-arm-kernel, x86
kernel_pgtables_set_pkey() allows setting the pkey of all page table
pages in swapper_pg_dir, recursively. This will be needed by
kpkeys_hardened_pgtables, as it relies on all PTPs being mapped with
a non-default pkey. Those initial kernel page tables cannot
practically be assigned a non-default pkey right when they are
allocated, so mutating them during (early) boot is required.
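The structure of the walk can be sketched with a minimal userspace
model; the two-level table and the pkey tag below are illustrative
stand-ins, not the kernel's types or the set_memory_pkey() API:

```c
#include <assert.h>
#include <stddef.h>

#define PTRS_PER_TABLE 4

/* Hypothetical two-level page table: a table page whose slots point
 * to lower-level table pages (NULL = "none"). Each table page carries
 * a mock pkey tag standing in for set_memory_pkey(). */
struct table {
	int pkey;
	struct table *next[PTRS_PER_TABLE];
};

/* Tag this table page, then recurse into present lower-level tables,
 * mirroring the set_pkey_p4d()/set_pkey_pud()/... chain in the patch. */
static int table_set_pkey(struct table *t, int pkey)
{
	t->pkey = pkey;			/* models set_page_pkey() */
	for (int i = 0; i < PTRS_PER_TABLE; i++) {
		if (!t->next[i])	/* models the none/bad/leaf checks */
			continue;
		int err = table_set_pkey(t->next[i], pkey);
		if (err)
			return err;
	}
	return 0;
}
```

The real implementation additionally handles folded levels
(mm_pud_folded() and friends) and leaves swapper_pg_dir itself
untouched, as the patch explains.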
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
---
Some sort of locking seems called for in
kernel_pgtables_set_pkey(), but I couldn't figure out what would be
appropriate.
---
include/linux/mm.h | 2 +
mm/memory.c | 130 +++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 132 insertions(+)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index c39c4945946c..683e883dae77 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -4179,4 +4179,6 @@ int arch_get_shadow_stack_status(struct task_struct *t, unsigned long __user *st
int arch_set_shadow_stack_status(struct task_struct *t, unsigned long status);
int arch_lock_shadow_stack_status(struct task_struct *t, unsigned long status);
+int kernel_pgtables_set_pkey(int pkey);
+
#endif /* _LINUX_MM_H */
diff --git a/mm/memory.c b/mm/memory.c
index 75c2dfd04f72..278ddf9f6249 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -76,6 +76,7 @@
#include <linux/ptrace.h>
#include <linux/vmalloc.h>
#include <linux/sched/sysctl.h>
+#include <linux/kpkeys.h>
#include <trace/events/kmem.h>
@@ -6974,3 +6975,132 @@ void vma_pgtable_walk_end(struct vm_area_struct *vma)
if (is_vm_hugetlb_page(vma))
hugetlb_vma_unlock_read(vma);
}
+
+static int set_page_pkey(void *p, int pkey)
+{
+ unsigned long addr = (unsigned long)p;
+
+ /*
+ * swapper_pg_dir itself will be made read-only by mark_rodata_ro()
+ * so there is no point in changing its pkey.
+ */
+ if (p == swapper_pg_dir)
+ return 0;
+
+ return set_memory_pkey(addr, 1, pkey);
+}
+
+static int set_pkey_pte(pmd_t *pmd, int pkey)
+{
+ pte_t *pte;
+ int err;
+
+ pte = pte_offset_kernel(pmd, 0);
+ err = set_page_pkey(pte, pkey);
+
+ return err;
+}
+
+static int set_pkey_pmd(pud_t *pud, int pkey)
+{
+ pmd_t *pmd;
+ int i, err = 0;
+
+ pmd = pmd_offset(pud, 0);
+
+ err = set_page_pkey(pmd, pkey);
+ if (err)
+ return err;
+
+ for (i = 0; i < PTRS_PER_PMD; i++) {
+ if (pmd_none(pmd[i]) || pmd_bad(pmd[i]) || pmd_leaf(pmd[i]))
+ continue;
+ err = set_pkey_pte(&pmd[i], pkey);
+ if (err)
+ break;
+ }
+
+ return err;
+}
+
+static int set_pkey_pud(p4d_t *p4d, int pkey)
+{
+ pud_t *pud;
+ int i, err = 0;
+
+ if (mm_pmd_folded(&init_mm))
+ return set_pkey_pmd((pud_t *)p4d, pkey);
+
+ pud = pud_offset(p4d, 0);
+
+ err = set_page_pkey(pud, pkey);
+ if (err)
+ return err;
+
+ for (i = 0; i < PTRS_PER_PUD; i++) {
+ if (pud_none(pud[i]) || pud_bad(pud[i]) || pud_leaf(pud[i]))
+ continue;
+ err = set_pkey_pmd(&pud[i], pkey);
+ if (err)
+ break;
+ }
+
+ return err;
+}
+
+static int set_pkey_p4d(pgd_t *pgd, int pkey)
+{
+ p4d_t *p4d;
+ int i, err = 0;
+
+ if (mm_pud_folded(&init_mm))
+ return set_pkey_pud((p4d_t *)pgd, pkey);
+
+ p4d = p4d_offset(pgd, 0);
+
+ err = set_page_pkey(p4d, pkey);
+ if (err)
+ return err;
+
+ for (i = 0; i < PTRS_PER_P4D; i++) {
+ if (p4d_none(p4d[i]) || p4d_bad(p4d[i]) || p4d_leaf(p4d[i]))
+ continue;
+ err = set_pkey_pud(&p4d[i], pkey);
+ if (err)
+ break;
+ }
+
+ return err;
+}
+
+/**
+ * kernel_pgtables_set_pkey - set pkey for all kernel page table pages
+ * @pkey: pkey to set the page table pages to
+ *
+ * Walks swapper_pg_dir setting the protection key of every page table page (at
+ * all levels) to @pkey. swapper_pg_dir itself is left untouched as it is
+ * expected to be mapped read-only by mark_rodata_ro().
+ *
+ * No-op if the architecture does not support kpkeys.
+ */
+int kernel_pgtables_set_pkey(int pkey)
+{
+ pgd_t *pgd = swapper_pg_dir;
+ int i, err = 0;
+
+ if (!arch_kpkeys_enabled())
+ return 0;
+
+ if (mm_p4d_folded(&init_mm))
+ return set_pkey_p4d(pgd, pkey);
+
+ for (i = 0; i < PTRS_PER_PGD; i++) {
+ if (pgd_none(pgd[i]) || pgd_bad(pgd[i]) || pgd_leaf(pgd[i]))
+ continue;
+ err = set_pkey_p4d(&pgd[i], pkey);
+ if (err)
+ break;
+ }
+
+ return err;
+}
--
2.47.0
* [RFC PATCH 09/16] mm: Introduce kpkeys_hardened_pgtables
2024-12-06 10:10 [RFC PATCH 00/16] pkeys-based page table hardening Kevin Brodsky
` (7 preceding siblings ...)
2024-12-06 10:11 ` [RFC PATCH 08/16] mm: Introduce kernel_pgtables_set_pkey() Kevin Brodsky
@ 2024-12-06 10:11 ` Kevin Brodsky
2024-12-06 10:11 ` [RFC PATCH 10/16] mm: Map page tables with privileged pkey Kevin Brodsky
` (7 subsequent siblings)
16 siblings, 0 replies; 29+ messages in thread
From: Kevin Brodsky @ 2024-12-06 10:11 UTC (permalink / raw)
To: linux-hardening
Cc: linux-kernel, Kevin Brodsky, aruna.ramakrishna, broonie,
catalin.marinas, dave.hansen, jannh, jeffxu, joey.gouly, kees,
maz, pierre.langlois, qperret, ryan.roberts, will,
linux-arm-kernel, x86
kpkeys_hardened_pgtables is a hardening feature based on kpkeys. It
aims to prevent the corruption of page tables by:
1. mapping all page table pages, both kernel and user, with a
   privileged pkey (KPKEYS_PKEY_PGTABLES), and
2. granting write access to that pkey only when running at a higher
   kpkeys level (KPKEYS_LVL_PGTABLES).
The feature is exposed as CONFIG_KPKEYS_HARDENED_PGTABLES; it
requires explicit architecture opt-in by selecting
ARCH_HAS_KPKEYS_HARDENED_PGTABLES, since much of the page table
handling is arch-specific.
This patch introduces an API to modify the PTPs' pkey and switch
kpkeys level using a guard object. Because this API is going to be
called from low-level pgtable helpers (setters, allocators), it must
be inactive on boot and explicitly switched on if and when kpkeys
become available. A static key is used for that purpose; it is the
responsibility of each architecture supporting
kpkeys_hardened_pgtables to call kpkeys_hardened_pgtables_enable()
as early as possible to switch on that static key. The initial
kernel page tables are also walked to set their pkey, since they
have already been allocated at that point.
The definition of the kpkeys_hardened_pgtables guard class does not
use the static key on the restore path to avoid mismatched
set/restore pairs. Indeed, enabling the static key itself involves
modifying page tables, and it is thus possible that the guard object
is created when the static key appears as false, and destroyed when it
appears as true. To avoid this situation, we reserve an invalid value
for the pkey register and use it to disable the restore path.
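The set/restore asymmetry can be modelled in a few lines of userspace
C; all names below are illustrative, not the kernel API:

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model of the guard logic described above. The pkey
 * register is a plain integer; 0 plays the role of the reserved
 * invalid value (KPKEYS_PKEY_REG_INVAL), i.e. a value the register
 * can never legitimately hold. */
#define PKEY_REG_INVAL 0

static bool feature_enabled;	/* models the static key */
static int pkey_reg = 10;	/* models POR_EL1 */

static int guard_enter(void)
{
	if (!feature_enabled)
		return PKEY_REG_INVAL;	/* nothing was switched */
	int old = pkey_reg;
	pkey_reg = 20;			/* models kpkeys_set_level() */
	return old;
}

static void guard_exit(int saved)
{
	/* Restore based on the saved value, NOT the static key: the
	 * key may have flipped to true while the guard was held. */
	if (saved != PKEY_REG_INVAL)
		pkey_reg = saved;
}
```

If guard_exit() consulted feature_enabled instead of the saved value,
flipping the key while a guard is held would "restore" a register
value that was never saved.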
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
---
include/asm-generic/kpkeys.h | 12 +++++++
include/linux/kpkeys.h | 67 ++++++++++++++++++++++++++++++++++-
mm/Kconfig | 3 ++
mm/Makefile | 1 +
mm/kpkeys_hardened_pgtables.c | 17 +++++++++
security/Kconfig.hardening | 12 +++++++
6 files changed, 111 insertions(+), 1 deletion(-)
create mode 100644 mm/kpkeys_hardened_pgtables.c
diff --git a/include/asm-generic/kpkeys.h b/include/asm-generic/kpkeys.h
index 3404ce249757..cec92334a9f3 100644
--- a/include/asm-generic/kpkeys.h
+++ b/include/asm-generic/kpkeys.h
@@ -2,8 +2,20 @@
#ifndef __ASM_GENERIC_KPKEYS_H
#define __ASM_GENERIC_KPKEYS_H
+#ifndef KPKEYS_PKEY_PGTABLES
+#define KPKEYS_PKEY_PGTABLES 1
+#endif
+
#ifndef KPKEYS_PKEY_DEFAULT
#define KPKEYS_PKEY_DEFAULT 0
#endif
+/*
+ * Represents a pkey register value that cannot be used, typically disabling
+ * access to all keys.
+ */
+#ifndef KPKEYS_PKEY_REG_INVAL
+#define KPKEYS_PKEY_REG_INVAL 0
+#endif
+
#endif /* __ASM_GENERIC_KPKEYS_H */
diff --git a/include/linux/kpkeys.h b/include/linux/kpkeys.h
index bcc063425926..bd3e0f36d2d6 100644
--- a/include/linux/kpkeys.h
+++ b/include/linux/kpkeys.h
@@ -4,12 +4,14 @@
#include <linux/bug.h>
#include <linux/cleanup.h>
+#include <linux/jump_label.h>
#include <linux/set_memory.h>
#define KPKEYS_LVL_DEFAULT 0
+#define KPKEYS_LVL_PGTABLES 1
#define KPKEYS_LVL_MIN KPKEYS_LVL_DEFAULT
-#define KPKEYS_LVL_MAX KPKEYS_LVL_DEFAULT
+#define KPKEYS_LVL_MAX KPKEYS_LVL_PGTABLES
#define KPKEYS_GUARD(_name, set_level, restore_pkey_reg) \
__DEFINE_CLASS_IS_CONDITIONAL(_name, false); \
@@ -64,4 +66,67 @@ static inline bool arch_kpkeys_enabled(void)
#endif /* CONFIG_ARCH_HAS_KPKEYS */
+#ifdef CONFIG_KPKEYS_HARDENED_PGTABLES
+
+DECLARE_STATIC_KEY_FALSE(kpkeys_hardened_pgtables_enabled);
+
+/*
+ * Use guard(kpkeys_hardened_pgtables)() to temporarily grant write access
+ * to page tables.
+ */
+KPKEYS_GUARD(kpkeys_hardened_pgtables,
+ static_branch_unlikely(&kpkeys_hardened_pgtables_enabled) ?
+ kpkeys_set_level(KPKEYS_LVL_PGTABLES) :
+ KPKEYS_PKEY_REG_INVAL,
+ _T != KPKEYS_PKEY_REG_INVAL ?
+ kpkeys_restore_pkey_reg(_T) :
+ (void)0)
+
+static inline int kpkeys_protect_pgtable_memory(unsigned long addr, int numpages)
+{
+ int ret = 0;
+
+ if (static_branch_unlikely(&kpkeys_hardened_pgtables_enabled))
+ ret = set_memory_pkey(addr, numpages, KPKEYS_PKEY_PGTABLES);
+
+ WARN_ON(ret);
+ return ret;
+}
+
+static inline int kpkeys_unprotect_pgtable_memory(unsigned long addr, int numpages)
+{
+ int ret = 0;
+
+ if (static_branch_unlikely(&kpkeys_hardened_pgtables_enabled))
+ ret = set_memory_pkey(addr, numpages, KPKEYS_PKEY_DEFAULT);
+
+ WARN_ON(ret);
+ return ret;
+}
+
+/*
+ * Enables kpkeys_hardened_pgtables and switches existing kernel page tables to
+ * a privileged pkey (KPKEYS_PKEY_PGTABLES).
+ *
+ * Should be called as early as possible by architecture code, after (k)pkeys
+ * are initialised and before any user task is spawned.
+ */
+void kpkeys_hardened_pgtables_enable(void);
+
+#else /* CONFIG_KPKEYS_HARDENED_PGTABLES */
+
+KPKEYS_GUARD(kpkeys_hardened_pgtables, 0, (void)_T)
+
+static inline int kpkeys_protect_pgtable_memory(unsigned long addr, int numpages)
+{
+ return 0;
+}
+static inline int kpkeys_unprotect_pgtable_memory(unsigned long addr, int numpages)
+{
+ return 0;
+}
+static inline void kpkeys_hardened_pgtables_enable(void) {}
+
+#endif /* CONFIG_KPKEYS_HARDENED_PGTABLES */
+
#endif /* _LINUX_KPKEYS_H */
diff --git a/mm/Kconfig b/mm/Kconfig
index f51dffca9d4e..07ae45a1395f 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -1106,6 +1106,9 @@ config ARCH_HAS_PKEYS
bool
config ARCH_HAS_KPKEYS
bool
+# ARCH_HAS_KPKEYS must be selected when selecting this option
+config ARCH_HAS_KPKEYS_HARDENED_PGTABLES
+ bool
config ARCH_USES_PG_ARCH_2
bool
diff --git a/mm/Makefile b/mm/Makefile
index dba52bb0da8a..ffe799c1c897 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -146,3 +146,4 @@ obj-$(CONFIG_GENERIC_IOREMAP) += ioremap.o
obj-$(CONFIG_SHRINKER_DEBUG) += shrinker_debug.o
obj-$(CONFIG_EXECMEM) += execmem.o
obj-$(CONFIG_TMPFS_QUOTA) += shmem_quota.o
+obj-$(CONFIG_KPKEYS_HARDENED_PGTABLES) += kpkeys_hardened_pgtables.o
diff --git a/mm/kpkeys_hardened_pgtables.c b/mm/kpkeys_hardened_pgtables.c
new file mode 100644
index 000000000000..e26fc20bdafe
--- /dev/null
+++ b/mm/kpkeys_hardened_pgtables.c
@@ -0,0 +1,17 @@
+// SPDX-License-Identifier: GPL-2.0-only
+#include <linux/mm.h>
+#include <linux/kpkeys.h>
+
+DEFINE_STATIC_KEY_FALSE(kpkeys_hardened_pgtables_enabled);
+
+void __init kpkeys_hardened_pgtables_enable(void)
+{
+ int ret;
+
+ if (!arch_kpkeys_enabled())
+ return;
+
+ static_branch_enable(&kpkeys_hardened_pgtables_enabled);
+ ret = kernel_pgtables_set_pkey(KPKEYS_PKEY_PGTABLES);
+ WARN_ON(ret);
+}
diff --git a/security/Kconfig.hardening b/security/Kconfig.hardening
index c9d5ca3d8d08..95f93f1d4055 100644
--- a/security/Kconfig.hardening
+++ b/security/Kconfig.hardening
@@ -300,6 +300,18 @@ config BUG_ON_DATA_CORRUPTION
If unsure, say N.
+config KPKEYS_HARDENED_PGTABLES
+ bool "Harden page tables using kernel pkeys"
+ depends on ARCH_HAS_KPKEYS_HARDENED_PGTABLES
+ help
+ This option makes all page tables mostly read-only by
+ allocating them with a non-default protection key (pkey) and
+ only enabling write access to that pkey in routines that are
+ expected to write to page table entries.
+
+ This option has no effect if the system does not support
+ kernel pkeys.
+
endmenu
config CC_HAS_RANDSTRUCT
--
2.47.0
* [RFC PATCH 10/16] mm: Map page tables with privileged pkey
2024-12-06 10:10 [RFC PATCH 00/16] pkeys-based page table hardening Kevin Brodsky
` (8 preceding siblings ...)
2024-12-06 10:11 ` [RFC PATCH 09/16] mm: Introduce kpkeys_hardened_pgtables Kevin Brodsky
@ 2024-12-06 10:11 ` Kevin Brodsky
2024-12-06 10:11 ` [RFC PATCH 11/16] arm64: kpkeys: Support KPKEYS_LVL_PGTABLES Kevin Brodsky
` (6 subsequent siblings)
16 siblings, 0 replies; 29+ messages in thread
From: Kevin Brodsky @ 2024-12-06 10:11 UTC (permalink / raw)
To: linux-hardening
Cc: linux-kernel, Kevin Brodsky, aruna.ramakrishna, broonie,
catalin.marinas, dave.hansen, jannh, jeffxu, joey.gouly, kees,
maz, pierre.langlois, qperret, ryan.roberts, will,
linux-arm-kernel, x86
If CONFIG_KPKEYS_HARDENED_PGTABLES is enabled, map allocated page
table pages using a privileged pkey (KPKEYS_PKEY_PGTABLES), so that
page tables can only be written under guard(kpkeys_hardened_pgtables).
This patch is a no-op if CONFIG_KPKEYS_HARDENED_PGTABLES is disabled
(default).
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
---
include/linux/mm.h | 20 ++++++++++++++++++--
1 file changed, 18 insertions(+), 2 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 683e883dae77..4fb25454ba85 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -31,6 +31,7 @@
#include <linux/kasan.h>
#include <linux/memremap.h>
#include <linux/slab.h>
+#include <linux/kpkeys.h>
struct mempolicy;
struct anon_vma;
@@ -2895,7 +2896,19 @@ static inline bool pagetable_is_reserved(struct ptdesc *pt)
*/
static inline struct ptdesc *pagetable_alloc_noprof(gfp_t gfp, unsigned int order)
{
- struct page *page = alloc_pages_noprof(gfp | __GFP_COMP, order);
+ struct page *page;
+ int ret;
+
+ page = alloc_pages_noprof(gfp | __GFP_COMP, order);
+ if (!page)
+ return NULL;
+
+ ret = kpkeys_protect_pgtable_memory((unsigned long)page_address(page),
+ 1 << order);
+ if (ret) {
+ __free_pages(page, order);
+ return NULL;
+ }
return page_ptdesc(page);
}
@@ -2911,8 +2924,11 @@ static inline struct ptdesc *pagetable_alloc_noprof(gfp_t gfp, unsigned int orde
static inline void pagetable_free(struct ptdesc *pt)
{
struct page *page = ptdesc_page(pt);
+ unsigned int order = compound_order(page);
- __free_pages(page, compound_order(page));
+ kpkeys_unprotect_pgtable_memory((unsigned long)page_address(page),
+ 1 << order);
+ __free_pages(page, order);
}
#if defined(CONFIG_SPLIT_PTE_PTLOCKS)
--
2.47.0
* [RFC PATCH 11/16] arm64: kpkeys: Support KPKEYS_LVL_PGTABLES
2024-12-06 10:10 [RFC PATCH 00/16] pkeys-based page table hardening Kevin Brodsky
` (9 preceding siblings ...)
2024-12-06 10:11 ` [RFC PATCH 10/16] mm: Map page tables with privileged pkey Kevin Brodsky
@ 2024-12-06 10:11 ` Kevin Brodsky
2024-12-06 10:11 ` [RFC PATCH 12/16] arm64: mm: Map p4d/pgd with privileged pkey Kevin Brodsky
` (5 subsequent siblings)
16 siblings, 0 replies; 29+ messages in thread
From: Kevin Brodsky @ 2024-12-06 10:11 UTC (permalink / raw)
To: linux-hardening
Cc: linux-kernel, Kevin Brodsky, aruna.ramakrishna, broonie,
catalin.marinas, dave.hansen, jannh, jeffxu, joey.gouly, kees,
maz, pierre.langlois, qperret, ryan.roberts, will,
linux-arm-kernel, x86
Enable RW access to KPKEYS_PKEY_PGTABLES (used to map page table
pages) if switching to KPKEYS_LVL_PGTABLES, otherwise only grant RO
access.
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
---
arch/arm64/include/asm/kpkeys.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/arm64/include/asm/kpkeys.h b/arch/arm64/include/asm/kpkeys.h
index e17f6df41873..4854e1f3babd 100644
--- a/arch/arm64/include/asm/kpkeys.h
+++ b/arch/arm64/include/asm/kpkeys.h
@@ -18,6 +18,8 @@ static inline bool arch_kpkeys_enabled(void)
static inline u64 por_set_kpkeys_level(u64 por, int level)
{
por = por_set_pkey_perms(por, KPKEYS_PKEY_DEFAULT, POE_RXW);
+ por = por_set_pkey_perms(por, KPKEYS_PKEY_PGTABLES,
+ level == KPKEYS_LVL_PGTABLES ? POE_RW : POE_R);
return por;
}
--
2.47.0
* [RFC PATCH 12/16] arm64: mm: Map p4d/pgd with privileged pkey
2024-12-06 10:10 [RFC PATCH 00/16] pkeys-based page table hardening Kevin Brodsky
` (10 preceding siblings ...)
2024-12-06 10:11 ` [RFC PATCH 11/16] arm64: kpkeys: Support KPKEYS_LVL_PGTABLES Kevin Brodsky
@ 2024-12-06 10:11 ` Kevin Brodsky
2024-12-09 10:24 ` Peter Zijlstra
2024-12-06 10:11 ` [RFC PATCH 13/16] arm64: mm: Reset pkey in __tlb_remove_table() Kevin Brodsky
` (4 subsequent siblings)
16 siblings, 1 reply; 29+ messages in thread
From: Kevin Brodsky @ 2024-12-06 10:11 UTC (permalink / raw)
To: linux-hardening
Cc: linux-kernel, Kevin Brodsky, aruna.ramakrishna, broonie,
catalin.marinas, dave.hansen, jannh, jeffxu, joey.gouly, kees,
maz, pierre.langlois, qperret, ryan.roberts, will,
linux-arm-kernel, x86
If CONFIG_KPKEYS_HARDENED_PGTABLES is enabled, map p4d/pgd pages
using a privileged pkey (KPKEYS_PKEY_PGTABLES), so that they can
only be written under guard(kpkeys_hardened_pgtables).
The case where pgd is not page-sized is not currently handled -
this is pending support for pkeys in kmem_cache.
This patch is a no-op if CONFIG_KPKEYS_HARDENED_PGTABLES is disabled
(default).
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
---
arch/arm64/include/asm/pgalloc.h | 21 ++++++++++++++++++---
arch/arm64/mm/pgd.c | 30 ++++++++++++++++++++++++++++--
2 files changed, 46 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/include/asm/pgalloc.h b/arch/arm64/include/asm/pgalloc.h
index e75422864d1b..c006aecd6ba5 100644
--- a/arch/arm64/include/asm/pgalloc.h
+++ b/arch/arm64/include/asm/pgalloc.h
@@ -88,18 +88,33 @@ static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgdp, p4d_t *p4dp)
static inline p4d_t *p4d_alloc_one(struct mm_struct *mm, unsigned long addr)
{
gfp_t gfp = GFP_PGTABLE_USER;
+ int ret;
if (mm == &init_mm)
gfp = GFP_PGTABLE_KERNEL;
- return (p4d_t *)get_zeroed_page(gfp);
+
+ addr = get_zeroed_page(gfp);
+ if (!addr)
+ return NULL;
+
+ ret = kpkeys_protect_pgtable_memory(addr, 1);
+ if (ret) {
+ free_page(addr);
+ return NULL;
+ }
+
+ return (p4d_t *)addr;
}
static inline void p4d_free(struct mm_struct *mm, p4d_t *p4d)
{
+ unsigned long addr = (unsigned long)p4d;
+
if (!pgtable_l5_enabled())
return;
- BUG_ON((unsigned long)p4d & (PAGE_SIZE-1));
- free_page((unsigned long)p4d);
+ BUG_ON(addr & (PAGE_SIZE-1));
+ kpkeys_unprotect_pgtable_memory(addr, 1);
+ free_page(addr);
}
#define __p4d_free_tlb(tlb, p4d, addr) p4d_free((tlb)->mm, p4d)
diff --git a/arch/arm64/mm/pgd.c b/arch/arm64/mm/pgd.c
index 0c501cabc238..3577cc1821af 100644
--- a/arch/arm64/mm/pgd.c
+++ b/arch/arm64/mm/pgd.c
@@ -28,12 +28,38 @@ static bool pgdir_is_page_size(void)
return false;
}
+static pgd_t *pgd_page_alloc(gfp_t gfp)
+{
+ unsigned long addr;
+ int ret;
+
+ addr = __get_free_page(gfp);
+ if (!addr)
+ return NULL;
+
+ ret = kpkeys_protect_pgtable_memory(addr, 1);
+ if (ret) {
+ free_page(addr);
+ return NULL;
+ }
+
+ return (pgd_t *)addr;
+}
+
+static void pgd_page_free(pgd_t *pgd)
+{
+ unsigned long addr = (unsigned long)pgd;
+
+ kpkeys_unprotect_pgtable_memory(addr, 1);
+ free_page(addr);
+}
+
pgd_t *pgd_alloc(struct mm_struct *mm)
{
gfp_t gfp = GFP_PGTABLE_USER;
if (pgdir_is_page_size())
- return (pgd_t *)__get_free_page(gfp);
+ return pgd_page_alloc(gfp);
else
return kmem_cache_alloc(pgd_cache, gfp);
}
@@ -41,7 +67,7 @@ pgd_t *pgd_alloc(struct mm_struct *mm)
void pgd_free(struct mm_struct *mm, pgd_t *pgd)
{
if (pgdir_is_page_size())
- free_page((unsigned long)pgd);
+ pgd_page_free(pgd);
else
kmem_cache_free(pgd_cache, pgd);
}
--
2.47.0
* [RFC PATCH 13/16] arm64: mm: Reset pkey in __tlb_remove_table()
2024-12-06 10:10 [RFC PATCH 00/16] pkeys-based page table hardening Kevin Brodsky
` (11 preceding siblings ...)
2024-12-06 10:11 ` [RFC PATCH 12/16] arm64: mm: Map p4d/pgd with privileged pkey Kevin Brodsky
@ 2024-12-06 10:11 ` Kevin Brodsky
2024-12-09 10:29 ` Peter Zijlstra
2024-12-06 10:11 ` [RFC PATCH 14/16] arm64: mm: Guard page table writes with kpkeys Kevin Brodsky
` (3 subsequent siblings)
16 siblings, 1 reply; 29+ messages in thread
From: Kevin Brodsky @ 2024-12-06 10:11 UTC (permalink / raw)
To: linux-hardening
Cc: linux-kernel, Kevin Brodsky, aruna.ramakrishna, broonie,
catalin.marinas, dave.hansen, jannh, jeffxu, joey.gouly, kees,
maz, pierre.langlois, qperret, ryan.roberts, will,
linux-arm-kernel, x86
Page table pages are typically freed via tlb_remove_table() and
friends. Ensure that the linear mapping for those pages is reset to
the default pkey when CONFIG_KPKEYS_HARDENED_PGTABLES is enabled.
This patch is a no-op if CONFIG_KPKEYS_HARDENED_PGTABLES is disabled
(default).
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
---
arch/arm64/include/asm/tlb.h | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/tlb.h b/arch/arm64/include/asm/tlb.h
index a947c6e784ed..d1611ffa6d91 100644
--- a/arch/arm64/include/asm/tlb.h
+++ b/arch/arm64/include/asm/tlb.h
@@ -10,10 +10,14 @@
#include <linux/pagemap.h>
#include <linux/swap.h>
+#include <linux/kpkeys.h>
static inline void __tlb_remove_table(void *_table)
{
- free_page_and_swap_cache((struct page *)_table);
+ struct page *page = (struct page *)_table;
+
+ kpkeys_unprotect_pgtable_memory((unsigned long)page_address(page), 1);
+ free_page_and_swap_cache(page);
}
#define tlb_flush tlb_flush
--
2.47.0
* [RFC PATCH 14/16] arm64: mm: Guard page table writes with kpkeys
2024-12-06 10:10 [RFC PATCH 00/16] pkeys-based page table hardening Kevin Brodsky
` (12 preceding siblings ...)
2024-12-06 10:11 ` [RFC PATCH 13/16] arm64: mm: Reset pkey in __tlb_remove_table() Kevin Brodsky
@ 2024-12-06 10:11 ` Kevin Brodsky
2024-12-06 10:11 ` [RFC PATCH 15/16] arm64: Enable kpkeys_hardened_pgtables support Kevin Brodsky
` (2 subsequent siblings)
16 siblings, 0 replies; 29+ messages in thread
From: Kevin Brodsky @ 2024-12-06 10:11 UTC (permalink / raw)
To: linux-hardening
Cc: linux-kernel, Kevin Brodsky, aruna.ramakrishna, broonie,
catalin.marinas, dave.hansen, jannh, jeffxu, joey.gouly, kees,
maz, pierre.langlois, qperret, ryan.roberts, will,
linux-arm-kernel, x86
When CONFIG_KPKEYS_HARDENED_PGTABLES is enabled, page tables (both
user and kernel) are mapped with a privileged pkey in the linear
mapping. As a result, they can only be written under the
kpkeys_hardened_pgtables guard, which sets POR_EL1 appropriately to
allow such writes.
Use this guard wherever page tables genuinely need to be written,
keeping its scope as small as possible (so that POR_EL1 is reset as
fast as possible). Where atomics are involved, the guard's scope
encompasses the whole loop to avoid switching POR_EL1 unnecessarily.
This patch is a no-op if CONFIG_KPKEYS_HARDENED_PGTABLES is disabled
(default).
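guard() and scoped_guard() come from the kernel's cleanup.h machinery,
which is built on the compiler's cleanup attribute. A userspace sketch
of the scoping behaviour, with illustrative names rather than the
kernel implementation:

```c
#include <assert.h>

/* 0 = restrictive, 1 = page tables writable (stand-in for POR_EL1) */
static int por;
static int observed_inside;

static void restore_por(int *saved) { por = *saved; }

/* Save the current value, grant write access; the cleanup attribute
 * arranges for restore_por() to run when the variable leaves scope,
 * which is what guard(kpkeys_hardened_pgtables)() relies on. */
#define pgtable_guard() \
	int kpg_saved __attribute__((cleanup(restore_por))) = por; por = 1

static void set_entry(int *entry, int val)
{
	pgtable_guard();	/* write access granted... */
	*entry = val;
	observed_inside = por;
}				/* ...and restored automatically here */
```

This is why keeping the guard's scope small matters: the restrictive
value comes back as soon as the enclosing block is left, with no
explicit restore call to forget.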
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
---
arch/arm64/include/asm/pgtable.h | 19 +++++++++++++++++--
arch/arm64/mm/fault.c | 2 ++
2 files changed, 19 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 6986345b537a..5f9d748f08ee 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -39,6 +39,7 @@
#include <linux/mm_types.h>
#include <linux/sched.h>
#include <linux/page_table_check.h>
+#include <linux/kpkeys.h>
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
#define __HAVE_ARCH_FLUSH_PMD_TLB_RANGE
@@ -314,6 +315,7 @@ static inline pte_t pte_clear_uffd_wp(pte_t pte)
static inline void __set_pte_nosync(pte_t *ptep, pte_t pte)
{
+ guard(kpkeys_hardened_pgtables)();
WRITE_ONCE(*ptep, pte);
}
@@ -758,6 +760,7 @@ static inline void set_pmd(pmd_t *pmdp, pmd_t pmd)
}
#endif /* __PAGETABLE_PMD_FOLDED */
+ guard(kpkeys_hardened_pgtables)();
WRITE_ONCE(*pmdp, pmd);
if (pmd_valid(pmd)) {
@@ -825,6 +828,7 @@ static inline void set_pud(pud_t *pudp, pud_t pud)
return;
}
+ guard(kpkeys_hardened_pgtables)();
WRITE_ONCE(*pudp, pud);
if (pud_valid(pud)) {
@@ -906,6 +910,7 @@ static inline void set_p4d(p4d_t *p4dp, p4d_t p4d)
return;
}
+ guard(kpkeys_hardened_pgtables)();
WRITE_ONCE(*p4dp, p4d);
dsb(ishst);
isb();
@@ -1033,6 +1038,7 @@ static inline void set_pgd(pgd_t *pgdp, pgd_t pgd)
return;
}
+ guard(kpkeys_hardened_pgtables)();
WRITE_ONCE(*pgdp, pgd);
dsb(ishst);
isb();
@@ -1233,6 +1239,7 @@ static inline int __ptep_test_and_clear_young(struct vm_area_struct *vma,
{
pte_t old_pte, pte;
+ guard(kpkeys_hardened_pgtables)();
pte = __ptep_get(ptep);
do {
old_pte = pte;
@@ -1279,7 +1286,10 @@ static inline int pmdp_test_and_clear_young(struct vm_area_struct *vma,
static inline pte_t __ptep_get_and_clear(struct mm_struct *mm,
unsigned long address, pte_t *ptep)
{
- pte_t pte = __pte(xchg_relaxed(&pte_val(*ptep), 0));
+ pte_t pte;
+
+ scoped_guard(kpkeys_hardened_pgtables)
+ pte = __pte(xchg_relaxed(&pte_val(*ptep), 0));
page_table_check_pte_clear(mm, pte);
@@ -1322,7 +1332,10 @@ static inline pte_t __get_and_clear_full_ptes(struct mm_struct *mm,
static inline pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm,
unsigned long address, pmd_t *pmdp)
{
- pmd_t pmd = __pmd(xchg_relaxed(&pmd_val(*pmdp), 0));
+ pmd_t pmd;
+
+ scoped_guard(kpkeys_hardened_pgtables)
+ pmd = __pmd(xchg_relaxed(&pmd_val(*pmdp), 0));
page_table_check_pmd_clear(mm, pmd);
@@ -1336,6 +1349,7 @@ static inline void ___ptep_set_wrprotect(struct mm_struct *mm,
{
pte_t old_pte;
+ guard(kpkeys_hardened_pgtables)();
do {
old_pte = pte;
pte = pte_wrprotect(pte);
@@ -1416,6 +1430,7 @@ static inline pmd_t pmdp_establish(struct vm_area_struct *vma,
unsigned long address, pmd_t *pmdp, pmd_t pmd)
{
page_table_check_pmd_set(vma->vm_mm, pmdp, pmd);
+ guard(kpkeys_hardened_pgtables)();
return __pmd(xchg_relaxed(&pmd_val(*pmdp), pmd_val(pmd)));
}
#endif
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index ef63651099a9..ab45047155b9 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -220,6 +220,8 @@ int __ptep_set_access_flags(struct vm_area_struct *vma,
if (pte_same(pte, entry))
return 0;
+ guard(kpkeys_hardened_pgtables)();
+
/* only preserve the access flags and write permission */
pte_val(entry) &= PTE_RDONLY | PTE_AF | PTE_WRITE | PTE_DIRTY;
--
2.47.0
* [RFC PATCH 15/16] arm64: Enable kpkeys_hardened_pgtables support
2024-12-06 10:10 [RFC PATCH 00/16] pkeys-based page table hardening Kevin Brodsky
` (13 preceding siblings ...)
2024-12-06 10:11 ` [RFC PATCH 14/16] arm64: mm: Guard page table writes with kpkeys Kevin Brodsky
@ 2024-12-06 10:11 ` Kevin Brodsky
2024-12-06 10:11 ` [RFC PATCH 16/16] mm: Add basic tests for kpkeys_hardened_pgtables Kevin Brodsky
2024-12-06 19:14 ` [RFC PATCH 00/16] pkeys-based page table hardening Jann Horn
16 siblings, 0 replies; 29+ messages in thread
From: Kevin Brodsky @ 2024-12-06 10:11 UTC (permalink / raw)
To: linux-hardening
Cc: linux-kernel, Kevin Brodsky, aruna.ramakrishna, broonie,
catalin.marinas, dave.hansen, jannh, jeffxu, joey.gouly, kees,
maz, pierre.langlois, qperret, ryan.roberts, will,
linux-arm-kernel, x86
kpkeys_hardened_pgtables should be enabled as early as possible (if
selected). It does however require kpkeys to be available, which on
arm64 means POE being detected and enabled. POE is a boot feature, so
calling kpkeys_hardened_pgtables_enable() just after
setup_boot_cpu_features() in smp_prepare_boot_cpu() is the best we
can do.
With that done, all the bits are in place and we can advertise
support for kpkeys_hardened_pgtables by selecting
ARCH_HAS_KPKEYS_HARDENED_PGTABLES if ARM64_POE is enabled.
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
---
arch/arm64/Kconfig | 1 +
arch/arm64/kernel/smp.c | 2 ++
2 files changed, 3 insertions(+)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index f35964641c1a..dac2f9a64826 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -2184,6 +2184,7 @@ config ARM64_POE
select ARCH_USES_HIGH_VMA_FLAGS
select ARCH_HAS_PKEYS
select ARCH_HAS_KPKEYS
+ select ARCH_HAS_KPKEYS_HARDENED_PGTABLES
help
The Permission Overlay Extension is used to implement Memory
Protection Keys. Memory Protection Keys provides a mechanism for
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 3b3f6b56e733..074cab55f9db 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -35,6 +35,7 @@
#include <linux/kgdb.h>
#include <linux/kvm_host.h>
#include <linux/nmi.h>
+#include <linux/kpkeys.h>
#include <asm/alternative.h>
#include <asm/atomic.h>
@@ -468,6 +469,7 @@ void __init smp_prepare_boot_cpu(void)
if (system_uses_irq_prio_masking())
init_gic_priority_masking();
+ kpkeys_hardened_pgtables_enable();
kasan_init_hw_tags();
/* Init percpu seeds for random tags after cpus are set up. */
kasan_init_sw_tags();
--
2.47.0
* [RFC PATCH 16/16] mm: Add basic tests for kpkeys_hardened_pgtables
2024-12-06 10:10 [RFC PATCH 00/16] pkeys-based page table hardening Kevin Brodsky
` (14 preceding siblings ...)
2024-12-06 10:11 ` [RFC PATCH 15/16] arm64: Enable kpkeys_hardened_pgtables support Kevin Brodsky
@ 2024-12-06 10:11 ` Kevin Brodsky
2024-12-06 19:14 ` [RFC PATCH 00/16] pkeys-based page table hardening Jann Horn
16 siblings, 0 replies; 29+ messages in thread
From: Kevin Brodsky @ 2024-12-06 10:11 UTC (permalink / raw)
To: linux-hardening
Cc: linux-kernel, Kevin Brodsky, aruna.ramakrishna, broonie,
catalin.marinas, dave.hansen, jannh, jeffxu, joey.gouly, kees,
maz, pierre.langlois, qperret, ryan.roberts, will,
linux-arm-kernel, x86
Add basic tests for the kpkeys_hardened_pgtables feature: attempt a
direct write to a kernel and a user page table entry and ensure it
fails.
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
---
mm/Makefile | 1 +
mm/kpkeys_hardened_pgtables_test.c | 71 ++++++++++++++++++++++++++++++
security/Kconfig.hardening | 12 +++++
3 files changed, 84 insertions(+)
create mode 100644 mm/kpkeys_hardened_pgtables_test.c
diff --git a/mm/Makefile b/mm/Makefile
index ffe799c1c897..49ac16ae6875 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -147,3 +147,4 @@ obj-$(CONFIG_SHRINKER_DEBUG) += shrinker_debug.o
obj-$(CONFIG_EXECMEM) += execmem.o
obj-$(CONFIG_TMPFS_QUOTA) += shmem_quota.o
obj-$(CONFIG_KPKEYS_HARDENED_PGTABLES) += kpkeys_hardened_pgtables.o
+obj-$(CONFIG_KPKEYS_HARDENED_PGTABLES_TEST) += kpkeys_hardened_pgtables_test.o
diff --git a/mm/kpkeys_hardened_pgtables_test.c b/mm/kpkeys_hardened_pgtables_test.c
new file mode 100644
index 000000000000..37b6ffaa55e6
--- /dev/null
+++ b/mm/kpkeys_hardened_pgtables_test.c
@@ -0,0 +1,71 @@
+// SPDX-License-Identifier: GPL-2.0-only
+#include <kunit/test.h>
+#include <linux/pgtable.h>
+#include <linux/mman.h>
+
+static void write_kernel_pte(struct kunit *test)
+{
+ pte_t *ptep;
+ pte_t pte;
+ int ret;
+
+ /*
+ * The choice of address is mostly arbitrary - we just need a page
+ * that is definitely mapped, such as the current function.
+ */
+ ptep = virt_to_kpte((unsigned long)&write_kernel_pte);
+ KUNIT_ASSERT_NOT_NULL_MSG(test, ptep, "Failed to get PTE");
+
+ pte = ptep_get(ptep);
+ pte = set_pte_bit(pte, __pgprot(PTE_WRITE));
+ ret = copy_to_kernel_nofault(ptep, &pte, sizeof(pte));
+ KUNIT_EXPECT_EQ_MSG(test, ret, -EFAULT,
+ "Direct PTE write wasn't prevented");
+}
+
+static void write_user_pmd(struct kunit *test)
+{
+ pmd_t *pmdp;
+ pmd_t pmd;
+ unsigned long uaddr;
+ int ret;
+
+ uaddr = kunit_vm_mmap(test, NULL, 0, PAGE_SIZE, PROT_READ,
+ MAP_ANONYMOUS | MAP_PRIVATE | MAP_POPULATE, 0);
+ KUNIT_ASSERT_NE_MSG(test, uaddr, 0, "Could not create userspace mm");
+
+ /* We passed MAP_POPULATE so a PMD should already be allocated */
+ pmdp = pmd_off(current->mm, uaddr);
+ KUNIT_ASSERT_NOT_NULL_MSG(test, pmdp, "Failed to get PMD");
+
+ pmd = pmdp_get(pmdp);
+ pmd = set_pmd_bit(pmd, __pgprot(PROT_SECT_NORMAL));
+ ret = copy_to_kernel_nofault(pmdp, &pmd, sizeof(pmd));
+ KUNIT_EXPECT_EQ_MSG(test, ret, -EFAULT,
+ "Direct PMD write wasn't prevented");
+}
+
+static int kpkeys_hardened_pgtables_suite_init(struct kunit_suite *suite)
+{
+ if (!arch_kpkeys_enabled()) {
+ pr_err("Cannot run kpkeys_hardened_pgtables tests: kpkeys are not supported\n");
+ return 1;
+ }
+
+ return 0;
+}
+
+static struct kunit_case kpkeys_hardened_pgtables_test_cases[] = {
+ KUNIT_CASE(write_kernel_pte),
+ KUNIT_CASE(write_user_pmd),
+ {}
+};
+
+static struct kunit_suite kpkeys_hardened_pgtables_test_suite = {
+ .name = "Hardened pgtables using kpkeys",
+ .test_cases = kpkeys_hardened_pgtables_test_cases,
+ .suite_init = kpkeys_hardened_pgtables_suite_init,
+};
+kunit_test_suite(kpkeys_hardened_pgtables_test_suite);
+
+MODULE_LICENSE("GPL");
diff --git a/security/Kconfig.hardening b/security/Kconfig.hardening
index 95f93f1d4055..8bc5d7235f6d 100644
--- a/security/Kconfig.hardening
+++ b/security/Kconfig.hardening
@@ -312,6 +312,18 @@ config KPKEYS_HARDENED_PGTABLES
This option has no effect if the system does not support
kernel pkeys.
+config KPKEYS_HARDENED_PGTABLES_TEST
+ tristate "KUnit tests for kpkeys_hardened_pgtables" if !KUNIT_ALL_TESTS
+ depends on KPKEYS_HARDENED_PGTABLES
+ depends on KUNIT
+ default KUNIT_ALL_TESTS
+ help
+ Enable this option to check that the kpkeys_hardened_pgtables feature
+ functions as intended, i.e. prevents arbitrary writes to user and
+ kernel page tables.
+
+ If unsure, say N.
+
endmenu
config CC_HAS_RANDSTRUCT
--
2.47.0
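As a hedged usage sketch (the config symbols come from the Kconfig hunk above; the kunit.py invocation and the file name are assumptions, not part of the series), the tests could be selected via a minimal .kunitconfig fragment:

```
# Hypothetical .kunitconfig fragment to build the tests above
CONFIG_KUNIT=y
CONFIG_KPKEYS_HARDENED_PGTABLES=y
CONFIG_KPKEYS_HARDENED_PGTABLES_TEST=y
```

and run with something like `./tools/testing/kunit/kunit.py run --arch=arm64 --kunitconfig=.kunitconfig`; on a system without kpkeys support, the suite_init hook above makes the suite report that it cannot run.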
^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [RFC PATCH 00/16] pkeys-based page table hardening
2024-12-06 10:10 [RFC PATCH 00/16] pkeys-based page table hardening Kevin Brodsky
` (15 preceding siblings ...)
2024-12-06 10:11 ` [RFC PATCH 16/16] mm: Add basic tests for kpkeys_hardened_pgtables Kevin Brodsky
@ 2024-12-06 19:14 ` Jann Horn
2024-12-09 12:57 ` Kevin Brodsky
16 siblings, 1 reply; 29+ messages in thread
From: Jann Horn @ 2024-12-06 19:14 UTC (permalink / raw)
To: Kevin Brodsky
Cc: linux-hardening, linux-kernel, aruna.ramakrishna, broonie,
catalin.marinas, dave.hansen, jeffxu, joey.gouly, kees, maz,
pierre.langlois, qperret, ryan.roberts, will, linux-arm-kernel,
x86
On Fri, Dec 6, 2024 at 11:13 AM Kevin Brodsky <kevin.brodsky@arm.com> wrote:
> This is a proposal to leverage protection keys (pkeys) to harden
> critical kernel data, by making it mostly read-only. The series includes
> a simple framework called "kpkeys" to manipulate pkeys for in-kernel use,
> as well as a page table hardening feature based on that framework
> (kpkeys_hardened_pgtables). Both are implemented on arm64 as a proof of
> concept, but they are designed to be compatible with any architecture
> implementing pkeys.
>
> The proposed approach is a typical use of pkeys: the data to protect is
> mapped with a given pkey P, and the pkey register is initially configured
> to grant read-only access to P. Where the protected data needs to be
> written to, the pkey register is temporarily switched to grant write
> access to P on the current CPU.
>
> The key fact this approach relies on is that the target data is
> only written to via a limited and well-defined API. This makes it
> possible to explicitly switch the pkey register where needed, without
> introducing excessively invasive changes, and only for a small amount of
> trusted code.
>
> Page tables were chosen as they are a popular (and critical) target for
> attacks, but there are of course many others - this is only a starting
> point (see section "Further use-cases"). It has become more and more
> common for accesses to such target data to be mediated by a hypervisor
> in vendor kernels; the hope is that kpkeys can provide much of that
> protection in a simpler manner. No benchmarking has been performed at
> this stage, but the runtime overhead should also be lower (though likely
> not negligible).
Yeah, it isn't great that vendor kernels contain such invasive changes...
I guess one difference between this approach and a hypervisor-based
approach is that a hypervisor that uses a second layer of page tables
can also prevent access through aliasing mappings, while pkeys only
prevent access through a specific mapping? (Like if an attacker
managed to add a page that is mapped into userspace to a page
allocator freelist, allocate this page as a page table, and use the
userspace mapping to write into this page table. But I guess whether
that is an issue depends on the threat model.)
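The pkey-switch pattern quoted above can be modeled as a small userspace C sketch - illustrative only; all names are assumptions loosely mirroring the series' naming, and a plain global stands in for the real per-CPU pkey register:

```c
#include <assert.h>

/* Models the per-CPU pkey register state (an assumption for illustration). */
enum kpkeys_level { KPKEYS_LVL_DEFAULT, KPKEYS_LVL_PGTABLES };

static enum kpkeys_level pkey_reg = KPKEYS_LVL_DEFAULT;
static int protected_data; /* models data mapped with the protected pkey P */

/* Switch the "pkey register", returning the previous level for restore. */
static enum kpkeys_level kpkeys_set_level(enum kpkeys_level lvl)
{
	enum kpkeys_level old = pkey_reg;

	pkey_reg = lvl;
	return old;
}

/*
 * Models a trusted helper such as set_pte(): the write succeeds only at
 * the privileged level; otherwise it "faults" (returns -1 here).
 */
static int write_protected(int val)
{
	if (pkey_reg != KPKEYS_LVL_PGTABLES)
		return -1;
	protected_data = val;
	return 0;
}
```

A trusted path then brackets its write: `old = kpkeys_set_level(KPKEYS_LVL_PGTABLES); write_protected(v); kpkeys_set_level(old);`. Any write attempted outside such a bracket fails, which is the property the series enforces in hardware rather than software.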
> # kpkeys_hardened_pgtables
>
> The kpkeys_hardened_pgtables feature uses the interface above to make
> the (kernel and user) page tables read-only by default, enabling write
> access only in helpers such as set_pte(). One complication is that those
> helpers as well as page table allocators are used very early, before
> kpkeys become available. Enabling kpkeys_hardened_pgtables, if and when
> kpkeys become available, is therefore done as follows:
>
> 1. A static key is turned on. This enables a transition to
> KPKEYS_LVL_PGTABLES in all helpers writing to page tables, and also
> impacts page table allocators (see step 3).
>
> 2. All pages holding kernel page tables are set to KPKEYS_PKEY_PGTABLES.
> This ensures they can only be written when running at
> KPKEYS_LVL_PGTABLES.
>
> 3. Page table allocators set the returned pages to KPKEYS_PKEY_PGTABLES
> (and the pkey is reset upon freeing). This ensures that all page
> tables are mapped with that privileged pkey.
>
> # Threat model
>
> The proposed scheme aims at mitigating data-only attacks (e.g.
> use-after-free/cross-cache attacks). In other words, it is assumed that
> control flow is not corrupted, and that the attacker does not achieve
> arbitrary code execution. Nothing prevents the pkey register from being
> set to its most permissive state - the assumption is that the register
> is only modified on legitimate code paths.
Is the threat model that the attacker has already achieved full
read/write access to unprotected kernel data and should be stopped
from gaining write access to protected data? Or is the threat model
that the attacker has achieved some limited corruption, and this
series is intended to make it harder to either gain write access to
protected data or achieve full read/write access to unprotected data?
* Re: [RFC PATCH 08/16] mm: Introduce kernel_pgtables_set_pkey()
2024-12-06 10:11 ` [RFC PATCH 08/16] mm: Introduce kernel_pgtables_set_pkey() Kevin Brodsky
@ 2024-12-09 10:03 ` Peter Zijlstra
2024-12-10 9:27 ` Kevin Brodsky
0 siblings, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2024-12-09 10:03 UTC (permalink / raw)
To: Kevin Brodsky
Cc: linux-hardening, linux-kernel, aruna.ramakrishna, broonie,
catalin.marinas, dave.hansen, jannh, jeffxu, joey.gouly, kees,
maz, pierre.langlois, qperret, ryan.roberts, will,
linux-arm-kernel, x86
On Fri, Dec 06, 2024 at 10:11:02AM +0000, Kevin Brodsky wrote:
> kernel_pgtables_set_pkey() allows setting the pkey of all page table
> pages in swapper_pg_dir, recursively. This will be needed by
> kpkeys_hardened_pgtables, as it relies on all PTPs being mapped with
> a non-default pkey. Those initial kernel page tables cannot
> practically be assigned a non-default pkey right when they are
> allocated, so mutating them during (early) boot is required.
>
> Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
> ---
>
> It feels that some sort of locking is called for in
> kernel_pgtables_set_pkey(), but I couldn't figure out what would be
> appropriate.
init_mm.page_table_lock is typically the one used to serialize kernel
page tables IIRC.
* Re: [RFC PATCH 12/16] arm64: mm: Map p4d/pgd with privileged pkey
2024-12-06 10:11 ` [RFC PATCH 12/16] arm64: mm: Map p4d/pgd with privileged pkey Kevin Brodsky
@ 2024-12-09 10:24 ` Peter Zijlstra
2024-12-10 9:27 ` Kevin Brodsky
0 siblings, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2024-12-09 10:24 UTC (permalink / raw)
To: Kevin Brodsky
Cc: linux-hardening, linux-kernel, aruna.ramakrishna, broonie,
catalin.marinas, dave.hansen, jannh, jeffxu, joey.gouly, kees,
maz, pierre.langlois, qperret, ryan.roberts, will,
linux-arm-kernel, x86
On Fri, Dec 06, 2024 at 10:11:06AM +0000, Kevin Brodsky wrote:
> If CONFIG_KPKEYS_HARDENED_PGTABLES is enabled, map p4d/pgd pages
> using a privileged pkey (KPKEYS_PKEY_PGTABLES), so that they can
> only be written under guard(kpkeys_hardened_pgtables).
>
> The case where pgd is not page-sized is not currently handled -
> this is pending support for pkeys in kmem_cache.
>
> This patch is a no-op if CONFIG_KPKEYS_HARDENED_PGTABLES is disabled
> (default).
Should not this live in pagetable_*_[cd]tor() in generic code?
* Re: [RFC PATCH 13/16] arm64: mm: Reset pkey in __tlb_remove_table()
2024-12-06 10:11 ` [RFC PATCH 13/16] arm64: mm: Reset pkey in __tlb_remove_table() Kevin Brodsky
@ 2024-12-09 10:29 ` Peter Zijlstra
2024-12-10 9:28 ` Kevin Brodsky
0 siblings, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2024-12-09 10:29 UTC (permalink / raw)
To: Kevin Brodsky
Cc: linux-hardening, linux-kernel, aruna.ramakrishna, broonie,
catalin.marinas, dave.hansen, jannh, jeffxu, joey.gouly, kees,
maz, pierre.langlois, qperret, ryan.roberts, will,
linux-arm-kernel, x86
On Fri, Dec 06, 2024 at 10:11:07AM +0000, Kevin Brodsky wrote:
> Page table pages are typically freed via tlb_remove_table() and
> friends. Ensure that the linear mapping for those pages is reset to
> the default pkey when CONFIG_KPKEYS_HARDENED_PGTABLES is enabled.
>
> This patch is a no-op if CONFIG_KPKEYS_HARDENED_PGTABLES is disabled
> (default).
>
> Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
> ---
> arch/arm64/include/asm/tlb.h | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/include/asm/tlb.h b/arch/arm64/include/asm/tlb.h
> index a947c6e784ed..d1611ffa6d91 100644
> --- a/arch/arm64/include/asm/tlb.h
> +++ b/arch/arm64/include/asm/tlb.h
> @@ -10,10 +10,14 @@
>
> #include <linux/pagemap.h>
> #include <linux/swap.h>
> +#include <linux/kpkeys.h>
>
> static inline void __tlb_remove_table(void *_table)
> {
> - free_page_and_swap_cache((struct page *)_table);
> + struct page *page = (struct page *)_table;
> +
> + kpkeys_unprotect_pgtable_memory((unsigned long)page_address(page), 1);
> + free_page_and_swap_cache(page);
> }
Same as for the others, perhaps stick this in generic code instead of in
the arch code?
* Re: [RFC PATCH 00/16] pkeys-based page table hardening
2024-12-06 19:14 ` [RFC PATCH 00/16] pkeys-based page table hardening Jann Horn
@ 2024-12-09 12:57 ` Kevin Brodsky
0 siblings, 0 replies; 29+ messages in thread
From: Kevin Brodsky @ 2024-12-09 12:57 UTC (permalink / raw)
To: Jann Horn
Cc: linux-hardening, linux-kernel, aruna.ramakrishna, broonie,
catalin.marinas, dave.hansen, jeffxu, joey.gouly, kees, maz,
pierre.langlois, qperret, ryan.roberts, will, linux-arm-kernel,
x86
On 06/12/2024 20:14, Jann Horn wrote:
> On Fri, Dec 6, 2024 at 11:13 AM Kevin Brodsky <kevin.brodsky@arm.com> wrote:
>> [...]
>>
>> Page tables were chosen as they are a popular (and critical) target for
>> attacks, but there are of course many others - this is only a starting
>> point (see section "Further use-cases"). It has become more and more
>> common for accesses to such target data to be mediated by a hypervisor
>> in vendor kernels; the hope is that kpkeys can provide much of that
>> protection in a simpler manner. No benchmarking has been performed at
>> this stage, but the runtime overhead should also be lower (though likely
>> not negligible).
> Yeah, it isn't great that vendor kernels contain such invasive changes...
>
> I guess one difference between this approach and a hypervisor-based
> approach is that a hypervisor that uses a second layer of page tables
> can also prevent access through aliasing mappings, while pkeys only
> prevent access through a specific mapping? (Like if an attacker
> managed to add a page that is mapped into userspace to a page
> allocator freelist, allocate this page as a page table, and use the
> userspace mapping to write into this page table. But I guess whether
> that is an issue depends on the threat model.)
Yes, that's correct. If an attacker is able to modify page tables then
kpkeys are easily defeated. (kpkeys_hardened_pgtables does mitigate
precisely that, though.) On the topic of aliases, it's worth noting that
this isn't an issue with page table pages (only the linear mapping is
used), but if we wanted to assign a pkey to vmalloc areas we'd also
have to amend the linear mapping.
>> [...]
>>
>> # Threat model
>>
>> The proposed scheme aims at mitigating data-only attacks (e.g.
>> use-after-free/cross-cache attacks). In other words, it is assumed that
>> control flow is not corrupted, and that the attacker does not achieve
>> arbitrary code execution. Nothing prevents the pkey register from being
>> set to its most permissive state - the assumption is that the register
>> is only modified on legitimate code paths.
> Is the threat model that the attacker has already achieved full
> read/write access to unprotected kernel data and should be stopped
> from gaining write access to protected data? Or is the threat model
> that the attacker has achieved some limited corruption, and this
> series is intended to make it harder to either gain write access to
> protected data or achieve full read/write access to unprotected data?
The assumption is that the attacker has acquired a write primitive that
could potentially allow corrupting any kernel data. The objective is to
make it harder to exploit that primitive by making critical data immune
to it. Nothing stops the attacker to turn to another (unprotected)
target, but this is no different from hypervisor-based protection - the
hope is that removing the low-hanging fruits makes it too difficult to
build a complete exploit chain.
- Kevin
* Re: [RFC PATCH 08/16] mm: Introduce kernel_pgtables_set_pkey()
2024-12-09 10:03 ` Peter Zijlstra
@ 2024-12-10 9:27 ` Kevin Brodsky
0 siblings, 0 replies; 29+ messages in thread
From: Kevin Brodsky @ 2024-12-10 9:27 UTC (permalink / raw)
To: Peter Zijlstra
Cc: linux-hardening, linux-kernel, aruna.ramakrishna, broonie,
catalin.marinas, dave.hansen, jannh, jeffxu, joey.gouly, kees,
maz, pierre.langlois, qperret, ryan.roberts, will,
linux-arm-kernel, x86
On 09/12/2024 11:03, Peter Zijlstra wrote:
> On Fri, Dec 06, 2024 at 10:11:02AM +0000, Kevin Brodsky wrote:
>> kernel_pgtables_set_pkey() allows setting the pkey of all page table
>> pages in swapper_pg_dir, recursively. This will be needed by
>> kpkeys_hardened_pgtables, as it relies on all PTPs being mapped with
>> a non-default pkey. Those initial kernel page tables cannot
>> practically be assigned a non-default pkey right when they are
>> allocated, so mutating them during (early) boot is required.
>>
>> Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
>> ---
>>
>> It feels that some sort of locking is called for in
>> kernel_pgtables_set_pkey(), but I couldn't figure out what would be
>> appropriate.
> init_mm.page_table_lock is typically the one used to serialize kernel
> page tables IIRC.
That does seem to be the case, thanks! Hopefully holding that spinlock
for the entire duration of the loop in kernel_pgtables_set_pkey() won't
be an issue.
- Kevin
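Concretely, the locking being discussed would amount to something like the following pseudocode (set_memory_pkey() and KPKEYS_PKEY_PGTABLES are named after other patches in the series; the walk itself is elided):

```
/* pseudocode - not the actual patch */
spin_lock(&init_mm.page_table_lock);
for each page-table page P reachable from swapper_pg_dir:
        set_memory_pkey(P, KPKEYS_PKEY_PGTABLES);
spin_unlock(&init_mm.page_table_lock);
```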
* Re: [RFC PATCH 12/16] arm64: mm: Map p4d/pgd with privileged pkey
2024-12-09 10:24 ` Peter Zijlstra
@ 2024-12-10 9:27 ` Kevin Brodsky
2024-12-10 12:23 ` Peter Zijlstra
0 siblings, 1 reply; 29+ messages in thread
From: Kevin Brodsky @ 2024-12-10 9:27 UTC (permalink / raw)
To: Peter Zijlstra
Cc: linux-hardening, linux-kernel, aruna.ramakrishna, broonie,
catalin.marinas, dave.hansen, jannh, jeffxu, joey.gouly, kees,
maz, pierre.langlois, qperret, ryan.roberts, will,
linux-arm-kernel, x86
On 09/12/2024 11:24, Peter Zijlstra wrote:
> On Fri, Dec 06, 2024 at 10:11:06AM +0000, Kevin Brodsky wrote:
>> If CONFIG_KPKEYS_HARDENED_PGTABLES is enabled, map p4d/pgd pages
>> using a privileged pkey (KPKEYS_PKEY_PGTABLES), so that they can
>> only be written under guard(kpkeys_hardened_pgtables).
>>
>> The case where pgd is not page-sized is not currently handled -
>> this is pending support for pkeys in kmem_cache.
>>
>> This patch is a no-op if CONFIG_KPKEYS_HARDENED_PGTABLES is disabled
>> (default).
> Should not this live in pagetable_*_[cd]tor() in generic code?
This would certainly be preferable but it doesn't look like such helpers
exist for p4d/pgd. For p4d, we could potentially handle this in the
generic __p4d_alloc(), but I'm not sure we can assume that
p4d_alloc_one() won't be called from somewhere else. pgd_alloc() is
entirely arch-specific so not much we can do there.
- Kevin
* Re: [RFC PATCH 13/16] arm64: mm: Reset pkey in __tlb_remove_table()
2024-12-09 10:29 ` Peter Zijlstra
@ 2024-12-10 9:28 ` Kevin Brodsky
2024-12-10 12:27 ` Peter Zijlstra
0 siblings, 1 reply; 29+ messages in thread
From: Kevin Brodsky @ 2024-12-10 9:28 UTC (permalink / raw)
To: Peter Zijlstra
Cc: linux-hardening, linux-kernel, aruna.ramakrishna, broonie,
catalin.marinas, dave.hansen, jannh, jeffxu, joey.gouly, kees,
maz, pierre.langlois, qperret, ryan.roberts, will,
linux-arm-kernel, x86
On 09/12/2024 11:29, Peter Zijlstra wrote:
> On Fri, Dec 06, 2024 at 10:11:07AM +0000, Kevin Brodsky wrote:
>> [...]
>>
>> diff --git a/arch/arm64/include/asm/tlb.h b/arch/arm64/include/asm/tlb.h
>> index a947c6e784ed..d1611ffa6d91 100644
>> --- a/arch/arm64/include/asm/tlb.h
>> +++ b/arch/arm64/include/asm/tlb.h
>> @@ -10,10 +10,14 @@
>>
>> #include <linux/pagemap.h>
>> #include <linux/swap.h>
>> +#include <linux/kpkeys.h>
>>
>> static inline void __tlb_remove_table(void *_table)
>> {
>> - free_page_and_swap_cache((struct page *)_table);
>> + struct page *page = (struct page *)_table;
>> +
>> + kpkeys_unprotect_pgtable_memory((unsigned long)page_address(page), 1);
>> + free_page_and_swap_cache(page);
>> }
> Same as for the others, perhaps stick this in generic code instead of in
> the arch code?
This should be doable, with some refactoring. __tlb_remove_table() is
currently called from two functions in mm/mmu_gather.c; I suppose I
could create a wrapper there that calls
kpkeys_unprotect_pgtable_memory() and then __tlb_remove_table(). As in
the p4d case, however, I do wonder how robust this is, as
__tlb_remove_table() could end up being called from other places.
- Kevin
* Re: [RFC PATCH 12/16] arm64: mm: Map p4d/pgd with privileged pkey
2024-12-10 9:27 ` Kevin Brodsky
@ 2024-12-10 12:23 ` Peter Zijlstra
2024-12-11 13:35 ` Kevin Brodsky
0 siblings, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2024-12-10 12:23 UTC (permalink / raw)
To: Kevin Brodsky
Cc: linux-hardening, linux-kernel, aruna.ramakrishna, broonie,
catalin.marinas, dave.hansen, jannh, jeffxu, joey.gouly, kees,
maz, pierre.langlois, qperret, ryan.roberts, will,
linux-arm-kernel, x86
On Tue, Dec 10, 2024 at 10:27:56AM +0100, Kevin Brodsky wrote:
> On 09/12/2024 11:24, Peter Zijlstra wrote:
> > On Fri, Dec 06, 2024 at 10:11:06AM +0000, Kevin Brodsky wrote:
> >> If CONFIG_KPKEYS_HARDENED_PGTABLES is enabled, map p4d/pgd pages
> >> using a privileged pkey (KPKEYS_PKEY_PGTABLES), so that they can
> >> only be written under guard(kpkeys_hardened_pgtables).
> >>
> >> The case where pgd is not page-sized is not currently handled -
> >> this is pending support for pkeys in kmem_cache.
> >>
> >> This patch is a no-op if CONFIG_KPKEYS_HARDENED_PGTABLES is disabled
> >> (default).
> > Should not this live in pagetable_*_[cd]tor() in generic code?
>
> This would certainly be preferable but it doesn't look like such helpers
> exist for p4d/pgd. For p4d, we could potentially handle this in the
> generic __p4d_alloc(), but I'm not sure we can assume that
> p4d_alloc_one() won't be called from somewhere else. pgd_alloc() is
> entirely arch-specific so not much we can do there.
Can't we add the missing pagetable_{p4d,pgd}_[cd]tor() functions? Yes,
it will mean touching a bunch of arch code, but it shouldn't be hard.
* Re: [RFC PATCH 13/16] arm64: mm: Reset pkey in __tlb_remove_table()
2024-12-10 9:28 ` Kevin Brodsky
@ 2024-12-10 12:27 ` Peter Zijlstra
2024-12-11 13:37 ` Kevin Brodsky
0 siblings, 1 reply; 29+ messages in thread
From: Peter Zijlstra @ 2024-12-10 12:27 UTC (permalink / raw)
To: Kevin Brodsky
Cc: linux-hardening, linux-kernel, aruna.ramakrishna, broonie,
catalin.marinas, dave.hansen, jannh, jeffxu, joey.gouly, kees,
maz, pierre.langlois, qperret, ryan.roberts, will,
linux-arm-kernel, x86
On Tue, Dec 10, 2024 at 10:28:44AM +0100, Kevin Brodsky wrote:
> On 09/12/2024 11:29, Peter Zijlstra wrote:
> > On Fri, Dec 06, 2024 at 10:11:07AM +0000, Kevin Brodsky wrote:
> >> [...]
> >>
> >> diff --git a/arch/arm64/include/asm/tlb.h b/arch/arm64/include/asm/tlb.h
> >> index a947c6e784ed..d1611ffa6d91 100644
> >> --- a/arch/arm64/include/asm/tlb.h
> >> +++ b/arch/arm64/include/asm/tlb.h
> >> @@ -10,10 +10,14 @@
> >>
> >> #include <linux/pagemap.h>
> >> #include <linux/swap.h>
> >> +#include <linux/kpkeys.h>
> >>
> >> static inline void __tlb_remove_table(void *_table)
> >> {
> >> - free_page_and_swap_cache((struct page *)_table);
> >> + struct page *page = (struct page *)_table;
> >> +
> >> + kpkeys_unprotect_pgtable_memory((unsigned long)page_address(page), 1);
> >> + free_page_and_swap_cache(page);
> >> }
> > Same as for the others, perhaps stick this in generic code instead of in
> > the arch code?
>
> This should be doable, with some refactoring. __tlb_remove_table() is
> currently called from two functions in mm/mmu_gather.c; I suppose I
> could create a wrapper there that calls
> kpkeys_unprotect_pgtable_memory() and then __tlb_remove_table(). As in
> the p4d case, however, I do wonder how robust this is, as
> __tlb_remove_table() could end up being called from other places.
I don't foresee other __tlb_remove_table() users; this is all rather
specific code. But if there ever were to be new users, it is something
they would have to take into consideration.
* Re: [RFC PATCH 12/16] arm64: mm: Map p4d/pgd with privileged pkey
2024-12-10 12:23 ` Peter Zijlstra
@ 2024-12-11 13:35 ` Kevin Brodsky
0 siblings, 0 replies; 29+ messages in thread
From: Kevin Brodsky @ 2024-12-11 13:35 UTC (permalink / raw)
To: Peter Zijlstra
Cc: linux-hardening, linux-kernel, aruna.ramakrishna, broonie,
catalin.marinas, dave.hansen, jannh, jeffxu, joey.gouly, kees,
maz, pierre.langlois, qperret, ryan.roberts, will,
linux-arm-kernel, x86
On 10/12/2024 13:23, Peter Zijlstra wrote:
> On Tue, Dec 10, 2024 at 10:27:56AM +0100, Kevin Brodsky wrote:
>> On 09/12/2024 11:24, Peter Zijlstra wrote:
>>> On Fri, Dec 06, 2024 at 10:11:06AM +0000, Kevin Brodsky wrote:
>>>> If CONFIG_KPKEYS_HARDENED_PGTABLES is enabled, map p4d/pgd pages
>>>> using a privileged pkey (KPKEYS_PKEY_PGTABLES), so that they can
>>>> only be written under guard(kpkeys_hardened_pgtables).
>>>>
>>>> The case where pgd is not page-sized is not currently handled -
>>>> this is pending support for pkeys in kmem_cache.
>>>>
>>>> This patch is a no-op if CONFIG_KPKEYS_HARDENED_PGTABLES is disabled
>>>> (default).
>>> Should not this live in pagetable_*_[cd]tor() in generic code?
>> This would certainly be preferable but it doesn't look like such helpers
>> exist for p4d/pgd. For p4d, we could potentially handle this in the
>> generic __p4d_alloc(), but I'm not sure we can assume that
>> p4d_alloc_one() won't be called from somewhere else. pgd_alloc() is
>> entirely arch-specific so not much we can do there.
> Can't we add the missing pagetable_{p4d,pgd}_[cd]tor() functions. Yes,
> it will mean touching a bunch of arch code, but it shouldn't be hard.
It does look doable. The p4d level shouldn't be an issue; it's unclear
why it doesn't follow the same pattern as pud already. pgd will be more
involved (no generic layer at all) but as you say it should just be
about some churn in arch code.
An extra complication is that the pgd level may be smaller than a page,
at least on arm64 (see pgd_alloc() in arch/arm64/mm/pgd.c). I suppose
affected architectures will have to define their own pgd_alloc_one().
I'll give it a try and post something separately if it looks sensible.
- Kevin
* Re: [RFC PATCH 13/16] arm64: mm: Reset pkey in __tlb_remove_table()
2024-12-10 12:27 ` Peter Zijlstra
@ 2024-12-11 13:37 ` Kevin Brodsky
0 siblings, 0 replies; 29+ messages in thread
From: Kevin Brodsky @ 2024-12-11 13:37 UTC (permalink / raw)
To: Peter Zijlstra
Cc: linux-hardening, linux-kernel, aruna.ramakrishna, broonie,
catalin.marinas, dave.hansen, jannh, jeffxu, joey.gouly, kees,
maz, pierre.langlois, qperret, ryan.roberts, will,
linux-arm-kernel, x86
On 10/12/2024 13:27, Peter Zijlstra wrote:
> On Tue, Dec 10, 2024 at 10:28:44AM +0100, Kevin Brodsky wrote:
>> On 09/12/2024 11:29, Peter Zijlstra wrote:
>>> On Fri, Dec 06, 2024 at 10:11:07AM +0000, Kevin Brodsky wrote:
>>>> [...]
>>>>
>>>> diff --git a/arch/arm64/include/asm/tlb.h b/arch/arm64/include/asm/tlb.h
>>>> index a947c6e784ed..d1611ffa6d91 100644
>>>> --- a/arch/arm64/include/asm/tlb.h
>>>> +++ b/arch/arm64/include/asm/tlb.h
>>>> @@ -10,10 +10,14 @@
>>>>
>>>> #include <linux/pagemap.h>
>>>> #include <linux/swap.h>
>>>> +#include <linux/kpkeys.h>
>>>>
>>>> static inline void __tlb_remove_table(void *_table)
>>>> {
>>>> - free_page_and_swap_cache((struct page *)_table);
>>>> + struct page *page = (struct page *)_table;
>>>> +
>>>> + kpkeys_unprotect_pgtable_memory((unsigned long)page_address(page), 1);
>>>> + free_page_and_swap_cache(page);
>>>> }
>>> Same as for the others, perhaps stick this in generic code instead of in
>>> the arch code?
>> This should be doable, with some refactoring. __tlb_remove_table() is
>> currently called from two functions in mm/mmu_gather.c, I suppose I
>> could create a wrapper there that calls
>> kpkeys_unprotect_pgtable_memory() and then __tlb_remove_table(). Like in
>> the p4d case I do however wonder how robust this is, as
>> __tlb_remove_table() could end up being called from other places.
> I don't foresee other __tlb_remove_table() users, this is all rather
> speicific code. But if there ever were to be new users, it is something
> they would have to take into consideration.
Fair enough, I'll handle that in mm/mmu_gather.c then.
- Kevin
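A rough pseudocode sketch of such a wrapper in mm/mmu_gather.c (the kpkeys_unprotect_pgtable_memory() call mirrors the quoted arm64 diff; the wrapper name and its call sites are assumptions):

```
/* pseudocode - generic mm/mmu_gather.c, called instead of the arch hook */
static void __tlb_remove_table_unprotect(void *table)
{
        kpkeys_unprotect_pgtable_memory(
                (unsigned long)page_address((struct page *)table), 1);
        __tlb_remove_table(table); /* arch hook, unchanged */
}
```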
end of thread, other threads:[~2024-12-11 13:39 UTC | newest]
Thread overview: 29+ messages
2024-12-06 10:10 [RFC PATCH 00/16] pkeys-based page table hardening Kevin Brodsky
2024-12-06 10:10 ` [RFC PATCH 01/16] mm: Introduce kpkeys Kevin Brodsky
2024-12-06 10:10 ` [RFC PATCH 02/16] set_memory: Introduce set_memory_pkey() stub Kevin Brodsky
2024-12-06 10:10 ` [RFC PATCH 03/16] arm64: mm: Enable overlays for all EL1 indirect permissions Kevin Brodsky
2024-12-06 10:10 ` [RFC PATCH 04/16] arm64: Introduce por_set_pkey_perms() helper Kevin Brodsky
2024-12-06 10:10 ` [RFC PATCH 05/16] arm64: Implement asm/kpkeys.h using POE Kevin Brodsky
2024-12-06 10:11 ` [RFC PATCH 06/16] arm64: set_memory: Implement set_memory_pkey() Kevin Brodsky
2024-12-06 10:11 ` [RFC PATCH 07/16] arm64: Enable kpkeys Kevin Brodsky
2024-12-06 10:11 ` [RFC PATCH 08/16] mm: Introduce kernel_pgtables_set_pkey() Kevin Brodsky
2024-12-09 10:03 ` Peter Zijlstra
2024-12-10 9:27 ` Kevin Brodsky
2024-12-06 10:11 ` [RFC PATCH 09/16] mm: Introduce kpkeys_hardened_pgtables Kevin Brodsky
2024-12-06 10:11 ` [RFC PATCH 10/16] mm: Map page tables with privileged pkey Kevin Brodsky
2024-12-06 10:11 ` [RFC PATCH 11/16] arm64: kpkeys: Support KPKEYS_LVL_PGTABLES Kevin Brodsky
2024-12-06 10:11 ` [RFC PATCH 12/16] arm64: mm: Map p4d/pgd with privileged pkey Kevin Brodsky
2024-12-09 10:24 ` Peter Zijlstra
2024-12-10 9:27 ` Kevin Brodsky
2024-12-10 12:23 ` Peter Zijlstra
2024-12-11 13:35 ` Kevin Brodsky
2024-12-06 10:11 ` [RFC PATCH 13/16] arm64: mm: Reset pkey in __tlb_remove_table() Kevin Brodsky
2024-12-09 10:29 ` Peter Zijlstra
2024-12-10 9:28 ` Kevin Brodsky
2024-12-10 12:27 ` Peter Zijlstra
2024-12-11 13:37 ` Kevin Brodsky
2024-12-06 10:11 ` [RFC PATCH 14/16] arm64: mm: Guard page table writes with kpkeys Kevin Brodsky
2024-12-06 10:11 ` [RFC PATCH 15/16] arm64: Enable kpkeys_hardened_pgtables support Kevin Brodsky
2024-12-06 10:11 ` [RFC PATCH 16/16] mm: Add basic tests for kpkeys_hardened_pgtables Kevin Brodsky
2024-12-06 19:14 ` [RFC PATCH 00/16] pkeys-based page table hardening Jann Horn
2024-12-09 12:57 ` Kevin Brodsky