* [PATCH v3 0/5] Improve arm64 pkeys handling in signal delivery
@ 2024-10-29 14:45 Kevin Brodsky
2024-10-29 14:45 ` [PATCH v3 1/5] arm64: signal: Improve POR_EL0 handling to avoid uaccess failures Kevin Brodsky
` (7 more replies)
0 siblings, 8 replies; 17+ messages in thread
From: Kevin Brodsky @ 2024-10-29 14:45 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Kevin Brodsky, akpm, anshuman.khandual, aruna.ramakrishna,
broonie, catalin.marinas, dave.hansen, Dave.Martin, jeffxu,
joey.gouly, keith.lucas, pierre.langlois, shuah, sroettger, tglx,
will, yury.khrustalev, linux-kselftest, x86
This series is a follow-up to Joey's Permission Overlay Extension (POE)
series [1] that recently landed on mainline. The goal is to improve the
way we handle the register that governs which pkeys/POIndex are
accessible (POR_EL0) during signal delivery. As things stand, we may
unexpectedly fail to write the signal frame on the stack because POR_EL0
is not reset before the uaccess operations. See patch 1 for more details
and the main changes this series brings.
A similar series landed recently for x86/MPK [2]; the present series
aims at aligning arm64 with x86. Worth noting: once the signal frame is
written, POR_EL0 is still set to POR_EL0_INIT, granting access to pkey 0
only. This means that a program that sets up an alternate signal stack
with a non-zero pkey will need some assembly trampoline to set POR_EL0
before invoking the real signal handler, as discussed here [3]. This is
not ideal, but it makes experimentation with pkeys in signal handlers
possible while waiting for a potential interface to control the pkey
state when delivering a signal. See Pierre's reply [4] for more
information about use-cases and a potential interface.
The x86 series also added kselftests to ensure that no spurious SIGSEGV
occurs during signal delivery regardless of which pkey is accessible at
the point where the signal is delivered. This series adapts those
kselftests to allow running them on arm64 (patches 4-5). There is a
dependency on Yury's PKEY_UNRESTRICTED patch [7] for patch 4
specifically.
Finally, patch 2 is a clean-up following feedback on Joey's series [5].
I have tested this series on arm64 and x86_64 (booting and running the
protection_keys and pkey_sighandler_tests mm kselftests).
- Kevin
---
v2..v3:
* Reordered patches (patch 1 is now the main patch).
* Patch 1: compute por_enable_all with an explicit loop, based on
arch_max_pkey() (suggestion from Dave M).
* Patch 4: improved naming, replaced global pkey reg value with inline
helper, made use of Yury's PKEY_UNRESTRICTED macro [7] (suggestions
from Dave H).
v2: https://lore.kernel.org/linux-arm-kernel/20241023150511.3923558-1-kevin.brodsky@arm.com/
v1..v2:
* In setup_rt_frame(), ensured that POR_EL0 is reset to its original
value if we fail to deliver the signal (addresses Catalin's concern [6]).
* Renamed *unpriv_access* to *user_access* in patch 3 (suggestion from
Dave).
* Made what patch 1-2 do explicit in the commit message body (suggestion
from Dave).
v1: https://lore.kernel.org/linux-arm-kernel/20241017133909.3837547-1-kevin.brodsky@arm.com/
[1] https://lore.kernel.org/linux-arm-kernel/20240822151113.1479789-1-joey.gouly@arm.com/
[2] https://lore.kernel.org/lkml/20240802061318.2140081-1-aruna.ramakrishna@oracle.com/
[3] https://lore.kernel.org/lkml/CABi2SkWxNkP2O7ipkP67WKz0-LV33e5brReevTTtba6oKUfHRw@mail.gmail.com/
[4] https://lore.kernel.org/linux-arm-kernel/87plns8owh.fsf@arm.com/
[5] https://lore.kernel.org/linux-arm-kernel/20241015114116.GA19334@willie-the-truck/
[6] https://lore.kernel.org/linux-arm-kernel/Zw6D2waVyIwYE7wd@arm.com/
[7] https://lore.kernel.org/all/20241028090715.509527-2-yury.khrustalev@arm.com/
Cc: akpm@linux-foundation.org
Cc: anshuman.khandual@arm.com
Cc: aruna.ramakrishna@oracle.com
Cc: broonie@kernel.org
Cc: catalin.marinas@arm.com
Cc: dave.hansen@linux.intel.com
Cc: Dave.Martin@arm.com
Cc: jeffxu@chromium.org
Cc: joey.gouly@arm.com
Cc: keith.lucas@oracle.com
Cc: pierre.langlois@arm.com
Cc: shuah@kernel.org
Cc: sroettger@google.com
Cc: tglx@linutronix.de
Cc: will@kernel.org
Cc: yury.khrustalev@arm.com
Cc: linux-kselftest@vger.kernel.org
Cc: x86@kernel.org
Kevin Brodsky (5):
arm64: signal: Improve POR_EL0 handling to avoid uaccess failures
arm64: signal: Remove unnecessary check when saving POE state
arm64: signal: Remove unused macro
selftests/mm: Use generic pkey register manipulation
selftests/mm: Enable pkey_sighandler_tests on arm64
arch/arm64/kernel/signal.c | 95 ++++++++++++---
tools/testing/selftests/mm/Makefile | 8 +-
tools/testing/selftests/mm/pkey-arm64.h | 1 +
tools/testing/selftests/mm/pkey-x86.h | 2 +
.../selftests/mm/pkey_sighandler_tests.c | 115 ++++++++++++++----
5 files changed, 176 insertions(+), 45 deletions(-)
--
2.43.0
* [PATCH v3 1/5] arm64: signal: Improve POR_EL0 handling to avoid uaccess failures
2024-10-29 14:45 [PATCH v3 0/5] Improve arm64 pkeys handling in signal delivery Kevin Brodsky
@ 2024-10-29 14:45 ` Kevin Brodsky
2024-10-30 22:01 ` Jeff Xu
2024-10-29 14:45 ` [PATCH v3 2/5] arm64: signal: Remove unnecessary check when saving POE state Kevin Brodsky
` (6 subsequent siblings)
7 siblings, 1 reply; 17+ messages in thread
From: Kevin Brodsky @ 2024-10-29 14:45 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Kevin Brodsky, akpm, anshuman.khandual, aruna.ramakrishna,
broonie, catalin.marinas, dave.hansen, Dave.Martin, jeffxu,
joey.gouly, keith.lucas, pierre.langlois, shuah, sroettger, tglx,
will, yury.khrustalev, linux-kselftest, x86
TL;DR: reset POR_EL0 to "allow all" before writing the signal frame,
preventing spurious uaccess failures.
When POE is supported, the POR_EL0 register constrains memory
accesses based on the target page's POIndex (pkey). This raises the
question: what constraints should apply to a signal handler? The
current answer is that POR_EL0 is reset to POR_EL0_INIT when
invoking the handler, giving it full access to POIndex 0. This is in
line with x86's MPK support and remains unchanged.
This is only part of the story, though. POR_EL0 constrains all
unprivileged memory accesses, meaning that uaccess routines such as
put_user() are also impacted. As a result POR_EL0 may prevent the
signal frame from being written to the signal stack (ultimately
causing a SIGSEGV). This is especially concerning when an alternate
signal stack is used, because userspace may want to prevent access
to it outside of signal handlers. There is currently no provision
for that: POR_EL0 is reset after writing to the stack, and
POR_EL0_INIT only enables access to POIndex 0.
This patch ensures that POR_EL0 is reset to its most permissive
state before the signal stack is accessed. Once the signal frame has
been fully written, POR_EL0 is still set to POR_EL0_INIT - it is up
to the signal handler to enable access to additional pkeys if
needed. As for sigreturn(), it expects to have access to the stack
like any other syscall; we only need to ensure that POR_EL0 is
restored from the signal frame after all uaccess calls. This
approach is in line with the recent x86/pkeys series [1].
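To illustrate the two register values involved, here is a minimal
user-space sketch. It is not kernel code: the MODEL_* constants are
assumed to mirror the kernel's POE_W, POE_RXW, POR_BITS_PER_PKEY,
POR_EL0_INIT and the value returned by arch_max_pkey() on arm64, and
both helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, chosen to mirror the arm64 definitions. */
#define MODEL_POE_W             0x4ULL  /* write permission bit */
#define MODEL_POE_RXW           0x7ULL  /* read + execute + write */
#define MODEL_POR_BITS_PER_PKEY 4
#define MODEL_MAX_PKEY          8       /* assumed arch_max_pkey() */
#define MODEL_POR_EL0_INIT      MODEL_POE_RXW /* full access to pkey 0 only */

/* Build the "allow all" value with an explicit loop over every pkey. */
static uint64_t model_por_enable_all(void)
{
	uint64_t por = 0;

	for (int pkey = 0; pkey < MODEL_MAX_PKEY; pkey++)
		por |= MODEL_POE_RXW << (pkey * MODEL_POR_BITS_PER_PKEY);

	return por;
}

/* Would a write to a page tagged with pkey be permitted under por? */
static bool model_por_allows_write(uint64_t por, int pkey)
{
	uint64_t perm = (por >> (pkey * MODEL_POR_BITS_PER_PKEY)) & 0xf;

	return (perm & MODEL_POE_W) != 0;
}
```

With these assumed values, the handler value only permits writes to
pages tagged with pkey 0, which is why writing a signal frame to an
alternate stack tagged with another pkey requires the more permissive
value first.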
Resetting POR_EL0 early introduces some complications, in that we
can no longer read the register directly in preserve_poe_context().
This is addressed by introducing a struct (user_access_state)
and helpers to manage any such register impacting user accesses
(uaccess and accesses in userspace). Things look like this on signal
delivery:
1. Save original POR_EL0 into struct [save_reset_user_access_state()]
2. Set POR_EL0 to "allow all" [save_reset_user_access_state()]
3. Create signal frame
4. Write saved POR_EL0 value to the signal frame [preserve_poe_context()]
5. Finalise signal frame
6. Depending on the outcome:
a. If all operations succeeded, set POR_EL0 to POR_EL0_INIT [set_handler_user_access_state()]
b. Otherwise, reset POR_EL0 to its original value [restore_user_access_state()]
If any step fails when setting up the signal frame, the process will
be sent a SIGSEGV, which it may be able to handle. Step 6.b ensures
that the original POR_EL0 is saved in the signal frame when
delivering that SIGSEGV (so that the original value is restored by
sigreturn).
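The delivery sequence above can be modelled in user space as follows.
This is a sketch under stated assumptions, not the kernel
implementation: the register is simulated by a plain variable, the
outcome of the uaccess operations is reduced to a boolean, and every
model_* / MODEL_* name and value is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values standing in for "allow all" and POR_EL0_INIT. */
#define MODEL_POR_ALLOW_ALL 0x77777777ULL
#define MODEL_POR_INIT      0x7ULL

struct model_ua_state { uint64_t por_el0; };
struct model_frame    { uint64_t saved_por_el0; };

/* Simulated POR_EL0 register. */
static uint64_t model_por_el0;

/* Steps 1-2: save the original value, then lift all restrictions. */
static void model_save_reset(struct model_ua_state *ua)
{
	ua->por_el0 = model_por_el0;
	model_por_el0 = MODEL_POR_ALLOW_ALL;
}

/*
 * Steps 3-6. frame_write_ok simulates whether writing the frame
 * succeeded. Returns 0 on success, non-zero on failure.
 */
static int model_deliver(struct model_frame *frame, bool frame_write_ok)
{
	struct model_ua_state ua;
	int err = 0;

	model_save_reset(&ua);

	if (frame_write_ok)
		/* Step 4: store the saved value, not the live register. */
		frame->saved_por_el0 = ua.por_el0;
	else
		err = 1;

	if (err == 0)
		model_por_el0 = MODEL_POR_INIT;  /* step 6.a */
	else
		model_por_el0 = ua.por_el0;      /* step 6.b */

	return err;
}
```

The point of the model is the ordering: the frame records the value
saved in step 1, and the register only takes its final value once no
further frame write is pending.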
The return path (sys_rt_sigreturn) doesn't strictly require any change
since restore_poe_context() is already called last. However, to
avoid uaccess calls being accidentally added after that point, we
use the same approach as in the delivery path, i.e. separating
uaccess from writing to the register:
1. Read saved POR_EL0 value from the signal frame [restore_poe_context()]
2. Set POR_EL0 to the saved value [restore_user_access_state()]
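The same separation on the return path can be sketched in user space.
Again this is only a model under stated assumptions: the register is a
plain variable, a failed uaccess read is simulated by a boolean, and
all model_* names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

struct model_ua_state { uint64_t por_el0; };
struct model_frame    { uint64_t saved_por_el0; };

/* Simulated POR_EL0 register. */
static uint64_t model_por_el0;

/*
 * Step 1: read the saved value from the frame into ua only.
 * The register itself is left untouched here.
 */
static int model_restore_poe_context(const struct model_frame *frame,
				     bool read_ok,
				     struct model_ua_state *ua)
{
	if (!read_ok)
		return 1;

	ua->por_el0 = frame->saved_por_el0;
	return 0;
}

/* Returns 0 on success, non-zero if the frame could not be read. */
static int model_sigreturn(const struct model_frame *frame, bool read_ok)
{
	struct model_ua_state ua;

	if (model_restore_poe_context(frame, read_ok, &ua))
		return 1;  /* bad frame: register not modified */

	/* Any remaining frame reads would happen here, before step 2. */

	model_por_el0 = ua.por_el0;  /* step 2: last operation */
	return 0;
}
```

Keeping the register write as the very last operation means that a
frame read added later between the two steps still runs with the
register value that was in effect on entry.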
[1] https://lore.kernel.org/lkml/20240802061318.2140081-1-aruna.ramakrishna@oracle.com/
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
---
arch/arm64/kernel/signal.c | 92 ++++++++++++++++++++++++++++++++------
1 file changed, 78 insertions(+), 14 deletions(-)
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 561986947530..c7d311d8b92a 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -19,6 +19,7 @@
#include <linux/ratelimit.h>
#include <linux/rseq.h>
#include <linux/syscalls.h>
+#include <linux/pkeys.h>
#include <asm/daifflags.h>
#include <asm/debug-monitors.h>
@@ -66,10 +67,63 @@ struct rt_sigframe_user_layout {
unsigned long end_offset;
};
+/*
+ * Holds any EL0-controlled state that influences unprivileged memory accesses.
+ * This includes both accesses done in userspace and uaccess done in the kernel.
+ *
+ * This state needs to be carefully managed to ensure that it doesn't cause
+ * uaccess to fail when setting up the signal frame, and the signal handler
+ * itself also expects a well-defined state when entered.
+ */
+struct user_access_state {
+ u64 por_el0;
+};
+
#define BASE_SIGFRAME_SIZE round_up(sizeof(struct rt_sigframe), 16)
#define TERMINATOR_SIZE round_up(sizeof(struct _aarch64_ctx), 16)
#define EXTRA_CONTEXT_SIZE round_up(sizeof(struct extra_context), 16)
+/*
+ * Save the user access state into ua_state and reset it to disable any
+ * restrictions.
+ */
+static void save_reset_user_access_state(struct user_access_state *ua_state)
+{
+ if (system_supports_poe()) {
+ u64 por_enable_all = 0;
+
+ for (int pkey = 0; pkey < arch_max_pkey(); pkey++)
+ por_enable_all |= POE_RXW << (pkey * POR_BITS_PER_PKEY);
+
+ ua_state->por_el0 = read_sysreg_s(SYS_POR_EL0);
+ write_sysreg_s(por_enable_all, SYS_POR_EL0);
+ /* Ensure that any subsequent uaccess observes the updated value */
+ isb();
+ }
+}
+
+/*
+ * Set the user access state for invoking the signal handler.
+ *
+ * No uaccess should be done after that function is called.
+ */
+static void set_handler_user_access_state(void)
+{
+ if (system_supports_poe())
+ write_sysreg_s(POR_EL0_INIT, SYS_POR_EL0);
+}
+
+/*
+ * Restore the user access state to the values saved in ua_state.
+ *
+ * No uaccess should be done after that function is called.
+ */
+static void restore_user_access_state(const struct user_access_state *ua_state)
+{
+ if (system_supports_poe())
+ write_sysreg_s(ua_state->por_el0, SYS_POR_EL0);
+}
+
static void init_user_layout(struct rt_sigframe_user_layout *user)
{
const size_t reserved_size =
@@ -261,18 +315,20 @@ static int restore_fpmr_context(struct user_ctxs *user)
return err;
}
-static int preserve_poe_context(struct poe_context __user *ctx)
+static int preserve_poe_context(struct poe_context __user *ctx,
+ const struct user_access_state *ua_state)
{
int err = 0;
__put_user_error(POE_MAGIC, &ctx->head.magic, err);
__put_user_error(sizeof(*ctx), &ctx->head.size, err);
- __put_user_error(read_sysreg_s(SYS_POR_EL0), &ctx->por_el0, err);
+ __put_user_error(ua_state->por_el0, &ctx->por_el0, err);
return err;
}
-static int restore_poe_context(struct user_ctxs *user)
+static int restore_poe_context(struct user_ctxs *user,
+ struct user_access_state *ua_state)
{
u64 por_el0;
int err = 0;
@@ -282,7 +338,7 @@ static int restore_poe_context(struct user_ctxs *user)
__get_user_error(por_el0, &(user->poe->por_el0), err);
if (!err)
- write_sysreg_s(por_el0, SYS_POR_EL0);
+ ua_state->por_el0 = por_el0;
return err;
}
@@ -850,7 +906,8 @@ static int parse_user_sigframe(struct user_ctxs *user,
}
static int restore_sigframe(struct pt_regs *regs,
- struct rt_sigframe __user *sf)
+ struct rt_sigframe __user *sf,
+ struct user_access_state *ua_state)
{
sigset_t set;
int i, err;
@@ -899,7 +956,7 @@ static int restore_sigframe(struct pt_regs *regs,
err = restore_zt_context(&user);
if (err == 0 && system_supports_poe() && user.poe)
- err = restore_poe_context(&user);
+ err = restore_poe_context(&user, ua_state);
return err;
}
@@ -908,6 +965,7 @@ SYSCALL_DEFINE0(rt_sigreturn)
{
struct pt_regs *regs = current_pt_regs();
struct rt_sigframe __user *frame;
+ struct user_access_state ua_state;
/* Always make any pending restarted system calls return -EINTR */
current->restart_block.fn = do_no_restart_syscall;
@@ -924,12 +982,14 @@ SYSCALL_DEFINE0(rt_sigreturn)
if (!access_ok(frame, sizeof (*frame)))
goto badframe;
- if (restore_sigframe(regs, frame))
+ if (restore_sigframe(regs, frame, &ua_state))
goto badframe;
if (restore_altstack(&frame->uc.uc_stack))
goto badframe;
+ restore_user_access_state(&ua_state);
+
return regs->regs[0];
badframe:
@@ -1035,7 +1095,8 @@ static int setup_sigframe_layout(struct rt_sigframe_user_layout *user,
}
static int setup_sigframe(struct rt_sigframe_user_layout *user,
- struct pt_regs *regs, sigset_t *set)
+ struct pt_regs *regs, sigset_t *set,
+ const struct user_access_state *ua_state)
{
int i, err = 0;
struct rt_sigframe __user *sf = user->sigframe;
@@ -1097,10 +1158,9 @@ static int setup_sigframe(struct rt_sigframe_user_layout *user,
struct poe_context __user *poe_ctx =
apply_user_offset(user, user->poe_offset);
- err |= preserve_poe_context(poe_ctx);
+ err |= preserve_poe_context(poe_ctx, ua_state);
}
-
/* ZA state if present */
if (system_supports_sme() && err == 0 && user->za_offset) {
struct za_context __user *za_ctx =
@@ -1237,9 +1297,6 @@ static void setup_return(struct pt_regs *regs, struct k_sigaction *ka,
sme_smstop();
}
- if (system_supports_poe())
- write_sysreg_s(POR_EL0_INIT, SYS_POR_EL0);
-
if (ka->sa.sa_flags & SA_RESTORER)
sigtramp = ka->sa.sa_restorer;
else
@@ -1253,6 +1310,7 @@ static int setup_rt_frame(int usig, struct ksignal *ksig, sigset_t *set,
{
struct rt_sigframe_user_layout user;
struct rt_sigframe __user *frame;
+ struct user_access_state ua_state;
int err = 0;
fpsimd_signal_preserve_current_state();
@@ -1260,13 +1318,14 @@ static int setup_rt_frame(int usig, struct ksignal *ksig, sigset_t *set,
if (get_sigframe(&user, ksig, regs))
return 1;
+ save_reset_user_access_state(&ua_state);
frame = user.sigframe;
__put_user_error(0, &frame->uc.uc_flags, err);
__put_user_error(NULL, &frame->uc.uc_link, err);
err |= __save_altstack(&frame->uc.uc_stack, regs->sp);
- err |= setup_sigframe(&user, regs, set);
+ err |= setup_sigframe(&user, regs, set, &ua_state);
if (err == 0) {
setup_return(regs, &ksig->ka, &user, usig);
if (ksig->ka.sa.sa_flags & SA_SIGINFO) {
@@ -1276,6 +1335,11 @@ static int setup_rt_frame(int usig, struct ksignal *ksig, sigset_t *set,
}
}
+ if (err == 0)
+ set_handler_user_access_state();
+ else
+ restore_user_access_state(&ua_state);
+
return err;
}
--
2.43.0
* [PATCH v3 2/5] arm64: signal: Remove unnecessary check when saving POE state
2024-10-29 14:45 [PATCH v3 0/5] Improve arm64 pkeys handling in signal delivery Kevin Brodsky
2024-10-29 14:45 ` [PATCH v3 1/5] arm64: signal: Improve POR_EL0 handling to avoid uaccess failures Kevin Brodsky
@ 2024-10-29 14:45 ` Kevin Brodsky
2024-10-29 14:45 ` [PATCH v3 3/5] arm64: signal: Remove unused macro Kevin Brodsky
` (5 subsequent siblings)
7 siblings, 0 replies; 17+ messages in thread
From: Kevin Brodsky @ 2024-10-29 14:45 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Kevin Brodsky, akpm, anshuman.khandual, aruna.ramakrishna,
broonie, catalin.marinas, dave.hansen, Dave.Martin, jeffxu,
joey.gouly, keith.lucas, pierre.langlois, shuah, sroettger, tglx,
will, yury.khrustalev, linux-kselftest, x86
The POE frame record is allocated unconditionally if POE is
supported. If the allocation fails, a SIGSEGV is delivered before
setup_sigframe() can be reached. As a result there is no need to
consider poe_offset before saving POR_EL0; just remove that check.
This is in line with other frame records (FPMR, TPIDR2).
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
---
arch/arm64/kernel/signal.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index c7d311d8b92a..d5eb517cc4df 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -1154,7 +1154,7 @@ static int setup_sigframe(struct rt_sigframe_user_layout *user,
err |= preserve_fpmr_context(fpmr_ctx);
}
- if (system_supports_poe() && err == 0 && user->poe_offset) {
+ if (system_supports_poe() && err == 0) {
struct poe_context __user *poe_ctx =
apply_user_offset(user, user->poe_offset);
--
2.43.0
* [PATCH v3 3/5] arm64: signal: Remove unused macro
2024-10-29 14:45 [PATCH v3 0/5] Improve arm64 pkeys handling in signal delivery Kevin Brodsky
2024-10-29 14:45 ` [PATCH v3 1/5] arm64: signal: Improve POR_EL0 handling to avoid uaccess failures Kevin Brodsky
2024-10-29 14:45 ` [PATCH v3 2/5] arm64: signal: Remove unnecessary check when saving POE state Kevin Brodsky
@ 2024-10-29 14:45 ` Kevin Brodsky
2024-10-29 14:45 ` [PATCH v3 4/5] selftests/mm: Use generic pkey register manipulation Kevin Brodsky
` (4 subsequent siblings)
7 siblings, 0 replies; 17+ messages in thread
From: Kevin Brodsky @ 2024-10-29 14:45 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Kevin Brodsky, akpm, anshuman.khandual, aruna.ramakrishna,
broonie, catalin.marinas, dave.hansen, Dave.Martin, jeffxu,
joey.gouly, keith.lucas, pierre.langlois, shuah, sroettger, tglx,
will, yury.khrustalev, linux-kselftest, x86
Commit 33f082614c34 ("arm64: signal: Allow expansion of the signal
frame") introduced the BASE_SIGFRAME_SIZE macro but it has
apparently never been used; just remove it.
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
---
arch/arm64/kernel/signal.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index d5eb517cc4df..b077181a66c0 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -79,7 +79,6 @@ struct user_access_state {
u64 por_el0;
};
-#define BASE_SIGFRAME_SIZE round_up(sizeof(struct rt_sigframe), 16)
#define TERMINATOR_SIZE round_up(sizeof(struct _aarch64_ctx), 16)
#define EXTRA_CONTEXT_SIZE round_up(sizeof(struct extra_context), 16)
--
2.43.0
* [PATCH v3 4/5] selftests/mm: Use generic pkey register manipulation
2024-10-29 14:45 [PATCH v3 0/5] Improve arm64 pkeys handling in signal delivery Kevin Brodsky
` (2 preceding siblings ...)
2024-10-29 14:45 ` [PATCH v3 3/5] arm64: signal: Remove unused macro Kevin Brodsky
@ 2024-10-29 14:45 ` Kevin Brodsky
2024-10-29 17:42 ` Dave Hansen
2024-10-29 14:45 ` [PATCH v3 5/5] selftests/mm: Enable pkey_sighandler_tests on arm64 Kevin Brodsky
` (3 subsequent siblings)
7 siblings, 1 reply; 17+ messages in thread
From: Kevin Brodsky @ 2024-10-29 14:45 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Kevin Brodsky, akpm, anshuman.khandual, aruna.ramakrishna,
broonie, catalin.marinas, dave.hansen, Dave.Martin, jeffxu,
joey.gouly, keith.lucas, pierre.langlois, shuah, sroettger, tglx,
will, yury.khrustalev, linux-kselftest, x86
pkey_sighandler_tests.c currently hardcodes x86 PKRU encodings. The
first step towards running those tests on arm64 is to abstract away
the pkey register values.
Since those tests want to deny access to all keys except a few,
we have each arch define PKEY_REG_ALLOW_NONE, the pkey register value
denying access to all keys. We then use the existing set_pkey_bits()
helper to grant access to specific keys.
Because pkeys may also remove the execute permission on arm64, we
need to be a little careful: all code is mapped with pkey 0, and we
need it to remain executable. pkey_reg_restrictive_default() is
introduced for that purpose: the value it returns prevents RW access
to all pkeys, but retains X permission for pkey 0.
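To see how this maps onto the previously hardcoded values, here is a
small model of the x86 PKRU encoding (2 bits per key: access-disable
and write-disable). model_set_pkey_bits() is a simplified, hypothetical
stand-in for the selftests' set_pkey_bits() helper, and the MODEL_*
values are assumptions mirroring the x86 definitions.

```c
#include <stdint.h>

/* Assumed x86 encoding: 2 bits per pkey. */
#define MODEL_PKEY_DISABLE_ACCESS  0x1ULL
#define MODEL_PKEY_DISABLE_WRITE   0x2ULL
#define MODEL_PKEY_UNRESTRICTED    0x0ULL
#define MODEL_PKEY_MASK \
	(MODEL_PKEY_DISABLE_ACCESS | MODEL_PKEY_DISABLE_WRITE)
#define MODEL_PKEY_BITS_PER_PKEY   2
#define MODEL_PKEY_REG_ALLOW_NONE  0x55555555ULL

/* Replace the bits belonging to pkey in reg with flags. */
static uint64_t model_set_pkey_bits(uint64_t reg, int pkey, uint64_t flags)
{
	unsigned int shift = pkey * MODEL_PKEY_BITS_PER_PKEY;

	reg &= ~(MODEL_PKEY_MASK << shift);
	reg |= (flags & MODEL_PKEY_MASK) << shift;

	return reg;
}

/* Model of pkey_reg_restrictive_default() for the x86 encoding. */
static uint64_t model_restrictive_default(void)
{
	return model_set_pkey_bits(MODEL_PKEY_REG_ALLOW_NONE, 0,
				   MODEL_PKEY_DISABLE_ACCESS);
}
```

Under these assumptions the restrictive default is 0x55555555 on x86,
and granting access to pkey 1 on top of it yields 0x55555551, i.e. the
values the tests used to hardcode.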
test_pkru_preserved_after_sigusr1() only checks that the pkey
register value remains unchanged after a signal is delivered, so the
particular value is irrelevant. We enable pkey 0 and a few more
arbitrary keys in the smallest range available on all architectures
(8 keys on arm64).
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
---
tools/testing/selftests/mm/pkey-arm64.h | 1 +
tools/testing/selftests/mm/pkey-x86.h | 2 +
.../selftests/mm/pkey_sighandler_tests.c | 53 +++++++++++++++----
3 files changed, 47 insertions(+), 9 deletions(-)
diff --git a/tools/testing/selftests/mm/pkey-arm64.h b/tools/testing/selftests/mm/pkey-arm64.h
index 580e1b0bb38e..d57fbeace38f 100644
--- a/tools/testing/selftests/mm/pkey-arm64.h
+++ b/tools/testing/selftests/mm/pkey-arm64.h
@@ -31,6 +31,7 @@
#define NR_RESERVED_PKEYS 1 /* pkey-0 */
#define PKEY_ALLOW_ALL 0x77777777
+#define PKEY_REG_ALLOW_NONE 0x0
#define PKEY_BITS_PER_PKEY 4
#define PAGE_SIZE sysconf(_SC_PAGESIZE)
diff --git a/tools/testing/selftests/mm/pkey-x86.h b/tools/testing/selftests/mm/pkey-x86.h
index 5f28e26a2511..ac91777c8917 100644
--- a/tools/testing/selftests/mm/pkey-x86.h
+++ b/tools/testing/selftests/mm/pkey-x86.h
@@ -34,6 +34,8 @@
#define PAGE_SIZE 4096
#define MB (1<<20)
+#define PKEY_REG_ALLOW_NONE 0x55555555
+
static inline void __page_o_noops(void)
{
/* 8-bytes of instruction * 512 bytes = 1 page */
diff --git a/tools/testing/selftests/mm/pkey_sighandler_tests.c b/tools/testing/selftests/mm/pkey_sighandler_tests.c
index a8088b645ad6..501880dbdc37 100644
--- a/tools/testing/selftests/mm/pkey_sighandler_tests.c
+++ b/tools/testing/selftests/mm/pkey_sighandler_tests.c
@@ -11,6 +11,7 @@
*/
#define _GNU_SOURCE
#define __SANE_USERSPACE_TYPES__
+#include <linux/mman.h>
#include <errno.h>
#include <sys/syscall.h>
#include <string.h>
@@ -65,6 +66,20 @@ long syscall_raw(long n, long a1, long a2, long a3, long a4, long a5, long a6)
return ret;
}
+/*
+ * Returns the most restrictive pkey register value that can be used by the
+ * tests.
+ */
+static inline u64 pkey_reg_restrictive_default(void)
+{
+ /*
+ * Disallow everything except execution on pkey 0, so that each caller
+ * doesn't need to enable it explicitly (the selftest code runs with
+ * its code mapped with pkey 0).
+ */
+ return set_pkey_bits(PKEY_REG_ALLOW_NONE, 0, PKEY_DISABLE_ACCESS);
+}
+
static void sigsegv_handler(int signo, siginfo_t *info, void *ucontext)
{
pthread_mutex_lock(&mutex);
@@ -113,7 +128,7 @@ static void raise_sigusr2(void)
static void *thread_segv_with_pkey0_disabled(void *ptr)
{
/* Disable MPK 0 (and all others too) */
- __write_pkey_reg(0x55555555);
+ __write_pkey_reg(pkey_reg_restrictive_default());
/* Segfault (with SEGV_MAPERR) */
*(int *) (0x1) = 1;
@@ -123,7 +138,7 @@ static void *thread_segv_with_pkey0_disabled(void *ptr)
static void *thread_segv_pkuerr_stack(void *ptr)
{
/* Disable MPK 0 (and all others too) */
- __write_pkey_reg(0x55555555);
+ __write_pkey_reg(pkey_reg_restrictive_default());
/* After we disable MPK 0, we can't access the stack to return */
return NULL;
@@ -133,6 +148,7 @@ static void *thread_segv_maperr_ptr(void *ptr)
{
stack_t *stack = ptr;
int *bad = (int *)1;
+ u64 pkey_reg;
/*
* Setup alternate signal stack, which should be pkey_mprotect()ed by
@@ -142,7 +158,9 @@ static void *thread_segv_maperr_ptr(void *ptr)
syscall_raw(SYS_sigaltstack, (long)stack, 0, 0, 0, 0, 0);
/* Disable MPK 0. Only MPK 1 is enabled. */
- __write_pkey_reg(0x55555551);
+ pkey_reg = pkey_reg_restrictive_default();
+ pkey_reg = set_pkey_bits(pkey_reg, 1, PKEY_UNRESTRICTED);
+ __write_pkey_reg(pkey_reg);
/* Segfault */
*bad = 1;
@@ -240,6 +258,7 @@ static void test_sigsegv_handler_with_different_pkey_for_stack(void)
int pkey;
int parent_pid = 0;
int child_pid = 0;
+ u64 pkey_reg;
sa.sa_flags = SA_SIGINFO | SA_ONSTACK;
@@ -257,7 +276,10 @@ static void test_sigsegv_handler_with_different_pkey_for_stack(void)
assert(stack != MAP_FAILED);
/* Allow access to MPK 0 and MPK 1 */
- __write_pkey_reg(0x55555550);
+ pkey_reg = pkey_reg_restrictive_default();
+ pkey_reg = set_pkey_bits(pkey_reg, 0, PKEY_UNRESTRICTED);
+ pkey_reg = set_pkey_bits(pkey_reg, 1, PKEY_UNRESTRICTED);
+ __write_pkey_reg(pkey_reg);
/* Protect the new stack with MPK 1 */
pkey = pkey_alloc(0, 0);
@@ -307,7 +329,13 @@ static void test_sigsegv_handler_with_different_pkey_for_stack(void)
static void test_pkru_preserved_after_sigusr1(void)
{
struct sigaction sa;
- unsigned long pkru = 0x45454544;
+ u64 pkey_reg;
+
+ /* Allow access to MPK 0 and an arbitrary set of keys */
+ pkey_reg = pkey_reg_restrictive_default();
+ pkey_reg = set_pkey_bits(pkey_reg, 0, PKEY_UNRESTRICTED);
+ pkey_reg = set_pkey_bits(pkey_reg, 3, PKEY_UNRESTRICTED);
+ pkey_reg = set_pkey_bits(pkey_reg, 7, PKEY_UNRESTRICTED);
sa.sa_flags = SA_SIGINFO;
@@ -320,7 +348,7 @@ static void test_pkru_preserved_after_sigusr1(void)
memset(&siginfo, 0, sizeof(siginfo));
- __write_pkey_reg(pkru);
+ __write_pkey_reg(pkey_reg);
raise(SIGUSR1);
@@ -330,7 +358,7 @@ static void test_pkru_preserved_after_sigusr1(void)
pthread_mutex_unlock(&mutex);
/* Ensure the pkru value is the same after returning from signal. */
- ksft_test_result(pkru == __read_pkey_reg() &&
+ ksft_test_result(pkey_reg == __read_pkey_reg() &&
siginfo.si_signo == SIGUSR1,
"%s\n", __func__);
}
@@ -347,6 +375,7 @@ static noinline void *thread_sigusr2_self(void *ptr)
'S', 'I', 'G', 'U', 'S', 'R', '2',
'.', '.', '.', '\n', '\0'};
stack_t *stack = ptr;
+ u64 pkey_reg;
/*
* Setup alternate signal stack, which should be pkey_mprotect()ed by
@@ -356,7 +385,9 @@ static noinline void *thread_sigusr2_self(void *ptr)
syscall(SYS_sigaltstack, (long)stack, 0, 0, 0, 0, 0);
/* Disable MPK 0. Only MPK 2 is enabled. */
- __write_pkey_reg(0x55555545);
+ pkey_reg = pkey_reg_restrictive_default();
+ pkey_reg = set_pkey_bits(pkey_reg, 2, PKEY_UNRESTRICTED);
+ __write_pkey_reg(pkey_reg);
raise_sigusr2();
@@ -384,6 +415,7 @@ static void test_pkru_sigreturn(void)
int pkey;
int parent_pid = 0;
int child_pid = 0;
+ u64 pkey_reg;
sa.sa_handler = SIG_DFL;
sa.sa_flags = 0;
@@ -418,7 +450,10 @@ static void test_pkru_sigreturn(void)
* the current thread's stack is protected by the default MPK 0. Hence
* both need to be enabled.
*/
- __write_pkey_reg(0x55555544);
+ pkey_reg = pkey_reg_restrictive_default();
+ pkey_reg = set_pkey_bits(pkey_reg, 0, PKEY_UNRESTRICTED);
+ pkey_reg = set_pkey_bits(pkey_reg, 2, PKEY_UNRESTRICTED);
+ __write_pkey_reg(pkey_reg);
/* Protect the stack with MPK 2 */
pkey = pkey_alloc(0, 0);
--
2.43.0
* [PATCH v3 5/5] selftests/mm: Enable pkey_sighandler_tests on arm64
2024-10-29 14:45 [PATCH v3 0/5] Improve arm64 pkeys handling in signal delivery Kevin Brodsky
` (3 preceding siblings ...)
2024-10-29 14:45 ` [PATCH v3 4/5] selftests/mm: Use generic pkey register manipulation Kevin Brodsky
@ 2024-10-29 14:45 ` Kevin Brodsky
2024-10-29 18:28 ` [PATCH v3 0/5] Improve arm64 pkeys handling in signal delivery Will Deacon
` (2 subsequent siblings)
7 siblings, 0 replies; 17+ messages in thread
From: Kevin Brodsky @ 2024-10-29 14:45 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Kevin Brodsky, akpm, anshuman.khandual, aruna.ramakrishna,
broonie, catalin.marinas, dave.hansen, Dave.Martin, jeffxu,
joey.gouly, keith.lucas, pierre.langlois, shuah, sroettger, tglx,
will, yury.khrustalev, linux-kselftest, x86
pkey_sighandler_tests.c makes raw syscalls using its own helper,
syscall_raw(). One of those syscalls is clone, which is problematic
as every architecture has a different opinion on the order of its
arguments.
To complete arm64 support, we therefore add an appropriate
implementation in syscall_raw(), and introduce a clone_raw() helper
that shuffles arguments as needed for each arch.
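As a rough illustration of the shuffling, limited to x86_64 and arm64,
the sketch below only computes the argument order and makes no syscall.
The enum, struct and function names are hypothetical; the slot that
does not receive child_tid is the TLS argument, left at 0 here.

```c
/* Hypothetical names, for illustration only. */
enum model_arch { MODEL_ARCH_X86_64, MODEL_ARCH_AARCH64 };

/* The five raw clone arguments, in the order the kernel expects. */
struct model_clone_args { long a[5]; };

static struct model_clone_args
model_clone_order(enum model_arch arch, unsigned long flags, void *stack,
		  int *parent_tid, int *child_tid)
{
	/* The first three arguments are in the same position on both. */
	struct model_clone_args out = {
		{ (long)flags, (long)stack, (long)parent_tid, 0, 0 }
	};

	if (arch == MODEL_ARCH_X86_64)
		out.a[3] = (long)child_tid;  /* child_tid, then tls */
	else
		out.a[4] = (long)child_tid;  /* tls, then child_tid */

	return out;
}
```

Keeping the difference inside one helper lets the call sites pass
typed pointers and stay identical across architectures.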
Having done this, we enable building pkey_sighandler_tests for arm64
in the Makefile.
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
---
tools/testing/selftests/mm/Makefile | 8 +--
.../selftests/mm/pkey_sighandler_tests.c | 62 ++++++++++++++-----
2 files changed, 50 insertions(+), 20 deletions(-)
diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile
index 02e1204971b0..0f8c110e0805 100644
--- a/tools/testing/selftests/mm/Makefile
+++ b/tools/testing/selftests/mm/Makefile
@@ -105,12 +105,12 @@ endif
ifeq ($(CAN_BUILD_X86_64),1)
TEST_GEN_FILES += $(BINARIES_64)
endif
-else
-ifneq (,$(filter $(ARCH),arm64 powerpc))
+else ifeq ($(ARCH),arm64)
+TEST_GEN_FILES += protection_keys
+TEST_GEN_FILES += pkey_sighandler_tests
+else ifeq ($(ARCH),powerpc)
TEST_GEN_FILES += protection_keys
-endif
-
endif
ifneq (,$(filter $(ARCH),arm64 mips64 parisc64 powerpc riscv64 s390x sparc64 x86_64 s390))
diff --git a/tools/testing/selftests/mm/pkey_sighandler_tests.c b/tools/testing/selftests/mm/pkey_sighandler_tests.c
index 501880dbdc37..c593a426341c 100644
--- a/tools/testing/selftests/mm/pkey_sighandler_tests.c
+++ b/tools/testing/selftests/mm/pkey_sighandler_tests.c
@@ -60,12 +60,44 @@ long syscall_raw(long n, long a1, long a2, long a3, long a4, long a5, long a6)
: "=a"(ret)
: "a"(n), "b"(a1), "c"(a2), "d"(a3), "S"(a4), "D"(a5)
: "memory");
+#elif defined __aarch64__
+ register long x0 asm("x0") = a1;
+ register long x1 asm("x1") = a2;
+ register long x2 asm("x2") = a3;
+ register long x3 asm("x3") = a4;
+ register long x4 asm("x4") = a5;
+ register long x5 asm("x5") = a6;
+ register long x8 asm("x8") = n;
+ asm volatile ("svc #0"
+ : "=r"(x0)
+ : "r"(x0), "r"(x1), "r"(x2), "r"(x3), "r"(x4), "r"(x5), "r"(x8)
+ : "memory");
+ ret = x0;
#else
# error syscall_raw() not implemented
#endif
return ret;
}
+static inline long clone_raw(unsigned long flags, void *stack,
+ int *parent_tid, int *child_tid)
+{
+ long a1 = flags;
+ long a2 = (long)stack;
+ long a3 = (long)parent_tid;
+#if defined(__x86_64__) || defined(__i386)
+ long a4 = (long)child_tid;
+ long a5 = 0;
+#elif defined(__aarch64__)
+ long a4 = 0;
+ long a5 = (long)child_tid;
+#else
+# error clone_raw() not implemented
+#endif
+
+ return syscall_raw(SYS_clone, a1, a2, a3, a4, a5, 0);
+}
+
/*
* Returns the most restrictive pkey register value that can be used by the
* tests.
@@ -294,14 +326,13 @@ static void test_sigsegv_handler_with_different_pkey_for_stack(void)
memset(&siginfo, 0, sizeof(siginfo));
/* Use clone to avoid newer glibcs using rseq on new threads */
- long ret = syscall_raw(SYS_clone,
- CLONE_VM | CLONE_FS | CLONE_FILES |
- CLONE_SIGHAND | CLONE_THREAD | CLONE_SYSVSEM |
- CLONE_PARENT_SETTID | CLONE_CHILD_CLEARTID |
- CLONE_DETACHED,
- (long) ((char *)(stack) + STACK_SIZE),
- (long) &parent_pid,
- (long) &child_pid, 0, 0);
+ long ret = clone_raw(CLONE_VM | CLONE_FS | CLONE_FILES |
+ CLONE_SIGHAND | CLONE_THREAD | CLONE_SYSVSEM |
+ CLONE_PARENT_SETTID | CLONE_CHILD_CLEARTID |
+ CLONE_DETACHED,
+ stack + STACK_SIZE,
+ &parent_pid,
+ &child_pid);
if (ret < 0) {
errno = -ret;
@@ -466,14 +497,13 @@ static void test_pkru_sigreturn(void)
sigstack.ss_size = STACK_SIZE;
/* Use clone to avoid newer glibcs using rseq on new threads */
- long ret = syscall_raw(SYS_clone,
- CLONE_VM | CLONE_FS | CLONE_FILES |
- CLONE_SIGHAND | CLONE_THREAD | CLONE_SYSVSEM |
- CLONE_PARENT_SETTID | CLONE_CHILD_CLEARTID |
- CLONE_DETACHED,
- (long) ((char *)(stack) + STACK_SIZE),
- (long) &parent_pid,
- (long) &child_pid, 0, 0);
+ long ret = clone_raw(CLONE_VM | CLONE_FS | CLONE_FILES |
+ CLONE_SIGHAND | CLONE_THREAD | CLONE_SYSVSEM |
+ CLONE_PARENT_SETTID | CLONE_CHILD_CLEARTID |
+ CLONE_DETACHED,
+ stack + STACK_SIZE,
+ &parent_pid,
+ &child_pid);
if (ret < 0) {
errno = -ret;
--
2.43.0
* Re: [PATCH v3 4/5] selftests/mm: Use generic pkey register manipulation
2024-10-29 14:45 ` [PATCH v3 4/5] selftests/mm: Use generic pkey register manipulation Kevin Brodsky
@ 2024-10-29 17:42 ` Dave Hansen
0 siblings, 0 replies; 17+ messages in thread
From: Dave Hansen @ 2024-10-29 17:42 UTC (permalink / raw)
To: Kevin Brodsky, linux-arm-kernel
Cc: akpm, anshuman.khandual, aruna.ramakrishna, broonie,
catalin.marinas, dave.hansen, Dave.Martin, jeffxu, joey.gouly,
keith.lucas, pierre.langlois, shuah, sroettger, tglx, will,
yury.khrustalev, linux-kselftest, x86
The test changes look good to me:
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
* Re: [PATCH v3 0/5] Improve arm64 pkeys handling in signal delivery
2024-10-29 14:45 [PATCH v3 0/5] Improve arm64 pkeys handling in signal delivery Kevin Brodsky
` (4 preceding siblings ...)
2024-10-29 14:45 ` [PATCH v3 5/5] selftests/mm: Enable pkey_sighandler_tests on arm64 Kevin Brodsky
@ 2024-10-29 18:28 ` Will Deacon
2024-10-30 21:59 ` Jeff Xu
2024-11-04 17:12 ` (subset) " Catalin Marinas
7 siblings, 0 replies; 17+ messages in thread
From: Will Deacon @ 2024-10-29 18:28 UTC (permalink / raw)
To: linux-arm-kernel, Kevin Brodsky
Cc: catalin.marinas, kernel-team, Will Deacon, akpm,
anshuman.khandual, aruna.ramakrishna, broonie, dave.hansen,
Dave.Martin, jeffxu, joey.gouly, keith.lucas, pierre.langlois,
shuah, sroettger, tglx, yury.khrustalev, linux-kselftest, x86
On Tue, 29 Oct 2024 14:45:34 +0000, Kevin Brodsky wrote:
> This series is a follow-up to Joey's Permission Overlay Extension (POE)
> series [1] that recently landed on mainline. The goal is to improve the
> way we handle the register that governs which pkeys/POIndex are
> accessible (POR_EL0) during signal delivery. As things stand, we may
> unexpectedly fail to write the signal frame on the stack because POR_EL0
> is not reset before the uaccess operations. See patch 1 for more details
> and the main changes this series brings.
>
> [...]
Applied first patch to arm64 (for-next/fixes), thanks!
[1/5] arm64: signal: Improve POR_EL0 handling to avoid uaccess failures
https://git.kernel.org/arm64/c/2e8a1acea859
Cheers,
--
Will
https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev
* Re: [PATCH v3 0/5] Improve arm64 pkeys handling in signal delivery
2024-10-29 14:45 [PATCH v3 0/5] Improve arm64 pkeys handling in signal delivery Kevin Brodsky
` (5 preceding siblings ...)
2024-10-29 18:28 ` [PATCH v3 0/5] Improve arm64 pkeys handling in signal delivery Will Deacon
@ 2024-10-30 21:59 ` Jeff Xu
2024-10-31 13:00 ` Kevin Brodsky
2024-11-04 17:12 ` (subset) " Catalin Marinas
7 siblings, 1 reply; 17+ messages in thread
From: Jeff Xu @ 2024-10-30 21:59 UTC (permalink / raw)
To: Kevin Brodsky
Cc: linux-arm-kernel, akpm, anshuman.khandual, aruna.ramakrishna,
broonie, catalin.marinas, dave.hansen, Dave.Martin, joey.gouly,
keith.lucas, pierre.langlois, shuah, sroettger, tglx, will,
yury.khrustalev, linux-kselftest, x86, Kees Cook,
Jorge Lucangeli Obes, Jann Horn
+Kees and Jorge and Jann
On Tue, Oct 29, 2024 at 7:45 AM Kevin Brodsky <kevin.brodsky@arm.com> wrote:
>
> This series is a follow-up to Joey's Permission Overlay Extension (POE)
> series [1] that recently landed on mainline. The goal is to improve the
> way we handle the register that governs which pkeys/POIndex are
> accessible (POR_EL0) during signal delivery. As things stand, we may
> unexpectedly fail to write the signal frame on the stack because POR_EL0
> is not reset before the uaccess operations. See patch 1 for more details
> and the main changes this series brings.
>
> A similar series landed recently for x86/MPK [2]; the present series
> aims at aligning arm64 with x86. Worth noting: once the signal frame is
> written, POR_EL0 is still set to POR_EL0_INIT, granting access to pkey 0
> only. This means that a program that sets up an alternate signal stack
> with a non-zero pkey will need some assembly trampoline to set POR_EL0
> before invoking the real signal handler, as discussed here [3]. This is
> not ideal, but it makes experimentation with pkeys in signal handlers
> possible while waiting for a potential interface to control the pkey
> state when delivering a signal. See Pierre's reply [4] for more
> information about use-cases and a potential interface.
>
Apologies in advance: I was unfamiliar with ARM's POR until reviewing
this patch series, so I might ask silly questions or base them on a
wrong understanding.
It seems that the patch has the same logic as the one Aruna Ramakrishna
proposed for x86; is this correct?
On the latest version of the x86 change [1], I commented asking whether
we want to consider adding a new flag, SS_PKEYALTSTACK (see
SS_AUTODISARM as an example), to sigaltstack(), and restricting this
mechanism (overwriting PKRU/POR_EL0 and the sigframe) to sigaltstack()
calls that set SS_PKEYALTSTACK.
There is a subtle difference if we do that: existing signal-handling
users might not care about or use PKEY/POE, and overwriting
PKRU/POR_EL0 and the sigframe every time adds extra CPU time to signal
delivery, which could be real-time sensitive.
Since I raised this comment on x86, I think raising it for ARM for
discussion would be OK; it might make sense to have a consistent API
experience for arm/x86 here.
Thanks
-Jeff
[1] https://lore.kernel.org/lkml/CABi2SkWjF2Sicrr71=a6H8XJyf9q9L_Nd5FPp0CJ2mvB46Rrrg@mail.gmail.com/
> The x86 series also added kselftests to ensure that no spurious SIGSEGV
> occurs during signal delivery regardless of which pkey is accessible at
> the point where the signal is delivered. This series adapts those
> kselftests to allow running them on arm64 (patch 4-5). There is a
> dependency on Yury's PKEY_UNRESTRICTED patch [7] for patch 4
> specifically.
>
> Finally patch 2 is a clean-up following feedback on Joey's series [5].
>
> I have tested this series on arm64 and x86_64 (booting and running the
> protection_keys and pkey_sighandler_tests mm kselftests).
>
> - Kevin
>
> ---
>
> v2..v3:
> * Reordered patches (patch 1 is now the main patch).
> * Patch 1: compute por_enable_all with an explicit loop, based on
> arch_max_pkey() (suggestion from Dave M).
> * Patch 4: improved naming, replaced global pkey reg value with inline
> helper, made use of Yury's PKEY_UNRESTRICTED macro [7] (suggestions
> from Dave H).
>
> v2: https://lore.kernel.org/linux-arm-kernel/20241023150511.3923558-1-kevin.brodsky@arm.com/
>
> v1..v2:
> * In setup_rt_frame(), ensured that POR_EL0 is reset to its original
> value if we fail to deliver the signal (addresses Catalin's concern [6]).
> * Renamed *unpriv_access* to *user_access* in patch 3 (suggestion from
> Dave).
> * Made what patch 1-2 do explicit in the commit message body (suggestion
> from Dave).
>
> v1: https://lore.kernel.org/linux-arm-kernel/20241017133909.3837547-1-kevin.brodsky@arm.com/
>
> [1] https://lore.kernel.org/linux-arm-kernel/20240822151113.1479789-1-joey.gouly@arm.com/
> [2] https://lore.kernel.org/lkml/20240802061318.2140081-1-aruna.ramakrishna@oracle.com/
> [3] https://lore.kernel.org/lkml/CABi2SkWxNkP2O7ipkP67WKz0-LV33e5brReevTTtba6oKUfHRw@mail.gmail.com/
> [4] https://lore.kernel.org/linux-arm-kernel/87plns8owh.fsf@arm.com/
> [5] https://lore.kernel.org/linux-arm-kernel/20241015114116.GA19334@willie-the-truck/
> [6] https://lore.kernel.org/linux-arm-kernel/Zw6D2waVyIwYE7wd@arm.com/
> [7] https://lore.kernel.org/all/20241028090715.509527-2-yury.khrustalev@arm.com/
>
> Cc: akpm@linux-foundation.org
> Cc: anshuman.khandual@arm.com
> Cc: aruna.ramakrishna@oracle.com
> Cc: broonie@kernel.org
> Cc: catalin.marinas@arm.com
> Cc: dave.hansen@linux.intel.com
> Cc: Dave.Martin@arm.com
> Cc: jeffxu@chromium.org
> Cc: joey.gouly@arm.com
> Cc: keith.lucas@oracle.com
> Cc: pierre.langlois@arm.com
> Cc: shuah@kernel.org
> Cc: sroettger@google.com
> Cc: tglx@linutronix.de
> Cc: will@kernel.org
> Cc: yury.khrustalev@arm.com
> Cc: linux-kselftest@vger.kernel.org
> Cc: x86@kernel.org
>
> Kevin Brodsky (5):
> arm64: signal: Improve POR_EL0 handling to avoid uaccess failures
> arm64: signal: Remove unnecessary check when saving POE state
> arm64: signal: Remove unused macro
> selftests/mm: Use generic pkey register manipulation
> selftests/mm: Enable pkey_sighandler_tests on arm64
>
> arch/arm64/kernel/signal.c | 95 ++++++++++++---
> tools/testing/selftests/mm/Makefile | 8 +-
> tools/testing/selftests/mm/pkey-arm64.h | 1 +
> tools/testing/selftests/mm/pkey-x86.h | 2 +
> .../selftests/mm/pkey_sighandler_tests.c | 115 ++++++++++++++----
> 5 files changed, 176 insertions(+), 45 deletions(-)
>
> --
> 2.43.0
>
* Re: [PATCH v3 1/5] arm64: signal: Improve POR_EL0 handling to avoid uaccess failures
2024-10-29 14:45 ` [PATCH v3 1/5] arm64: signal: Improve POR_EL0 handling to avoid uaccess failures Kevin Brodsky
@ 2024-10-30 22:01 ` Jeff Xu
2024-10-31 8:45 ` Kevin Brodsky
2024-10-31 9:33 ` Will Deacon
0 siblings, 2 replies; 17+ messages in thread
From: Jeff Xu @ 2024-10-30 22:01 UTC (permalink / raw)
To: Kevin Brodsky
Cc: linux-arm-kernel, akpm, anshuman.khandual, aruna.ramakrishna,
broonie, catalin.marinas, dave.hansen, Dave.Martin, joey.gouly,
keith.lucas, pierre.langlois, shuah, sroettger, tglx, will,
yury.khrustalev, linux-kselftest, x86, Kees Cook,
Jorge Lucangeli Obes, Jann Horn
+ Kees, Jorge, Jann
On Tue, Oct 29, 2024 at 7:46 AM Kevin Brodsky <kevin.brodsky@arm.com> wrote:
>
> TL;DR: reset POR_EL0 to "allow all" before writing the signal frame,
> preventing spurious uaccess failures.
>
> When POE is supported, the POR_EL0 register constrains memory
> accesses based on the target page's POIndex (pkey). This raises the
> question: what constraints should apply to a signal handler? The
> current answer is that POR_EL0 is reset to POR_EL0_INIT when
> invoking the handler, giving it full access to POIndex 0. This is in
> line with x86's MPK support and remains unchanged.
>
> This is only part of the story, though. POR_EL0 constrains all
> unprivileged memory accesses, meaning that uaccess routines such as
> put_user() are also impacted. As a result POR_EL0 may prevent the
> signal frame from being written to the signal stack (ultimately
> causing a SIGSEGV). This is especially concerning when an alternate
> signal stack is used, because userspace may want to prevent access
> to it outside of signal handlers. There is currently no provision
> for that: POR_EL0 is reset after writing to the stack, and
> POR_EL0_INIT only enables access to POIndex 0.
>
> This patch ensures that POR_EL0 is reset to its most permissive
> state before the signal stack is accessed. Once the signal frame has
> been fully written, POR_EL0 is still set to POR_EL0_INIT - it is up
> to the signal handler to enable access to additional pkeys if
> needed. As to sigreturn(), it expects having access to the stack
> like any other syscall; we only need to ensure that POR_EL0 is
> restored from the signal frame after all uaccess calls. This
> approach is in line with the recent x86/pkeys series [1].
>
> Resetting POR_EL0 early introduces some complications, in that we
> can no longer read the register directly in preserve_poe_context().
> This is addressed by introducing a struct (user_access_state)
> and helpers to manage any such register impacting user accesses
> (uaccess and accesses in userspace). Things look like this on signal
> delivery:
>
> 1. Save original POR_EL0 into struct [save_reset_user_access_state()]
> 2. Set POR_EL0 to "allow all" [save_reset_user_access_state()]
> 3. Create signal frame
> 4. Write saved POR_EL0 value to the signal frame [preserve_poe_context()]
> 5. Finalise signal frame
> 6. If all operations succeeded:
> a. Set POR_EL0 to POR_EL0_INIT [set_handler_user_access_state()]
> b. Else reset POR_EL0 to its original value [restore_user_access_state()]
>
> If any step fails when setting up the signal frame, the process will
> be sent a SIGSEGV, which it may be able to handle. Step 6.b ensures
> that the original POR_EL0 is saved in the signal frame when
> delivering that SIGSEGV (so that the original value is restored by
> sigreturn).
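For readers following along, the delivery sequence quoted above can be modelled in plain userspace C. This is only a sketch under stated assumptions: fake_por_el0 stands in for the real system register, deliver_model() and its frame_write_fails parameter are hypothetical names, and the two POR constants are assumed values (RXW for 8 pkeys, and RXW for pkey 0 only). It is not the kernel code from the patch.

```c
#include <stdint.h>

/* Assumed values, for illustration only. */
#define POR_ALLOW_ALL UINT64_C(0x77777777) /* RXW for pkeys 0..7 */
#define POR_INIT      UINT64_C(0x7)        /* RXW for pkey 0 only */

struct user_access_state {
	uint64_t por_el0;
};

/* Hypothetical stand-in for the POR_EL0 system register. */
static uint64_t fake_por_el0;

/* Steps 1-2: save the original value, then lift all restrictions. */
static void save_reset(struct user_access_state *ua)
{
	ua->por_el0 = fake_por_el0;
	fake_por_el0 = POR_ALLOW_ALL;
}

/*
 * Model of signal delivery. frame_write_fails simulates a uaccess error
 * while building the frame; frame_por models the POE record in the frame.
 * Returns 0 on success, non-zero on failure.
 */
static int deliver_model(int frame_write_fails, uint64_t *frame_por)
{
	struct user_access_state ua;
	int err;

	save_reset(&ua);                 /* steps 1-2 */
	err = frame_write_fails;         /* steps 3 and 5 */
	if (!err)
		*frame_por = ua.por_el0; /* step 4: the saved value, not the live one */

	/* Step 6: handler state on success, original state on failure. */
	fake_por_el0 = err ? ua.por_el0 : POR_INIT;
	return err;
}
```

The point of the model is step 4: the frame records the value saved in step 1 rather than the "allow all" value that is live while the frame is being written.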
>
> The return path (sys_rt_sigreturn) doesn't strictly require any change
> since restore_poe_context() is already called last. However, to
> avoid uaccess calls being accidentally added after that point, we
> use the same approach as in the delivery path, i.e. separating
> uaccess from writing to the register:
>
> 1. Read saved POR_EL0 value from the signal frame [restore_poe_context()]
> 2. Set POR_EL0 to the saved value [restore_user_access_state()]
>
> [1] https://lore.kernel.org/lkml/20240802061318.2140081-1-aruna.ramakrishna@oracle.com/
>
> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
> Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
> ---
> arch/arm64/kernel/signal.c | 92 ++++++++++++++++++++++++++++++++------
> 1 file changed, 78 insertions(+), 14 deletions(-)
>
> diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
> index 561986947530..c7d311d8b92a 100644
> --- a/arch/arm64/kernel/signal.c
> +++ b/arch/arm64/kernel/signal.c
> @@ -19,6 +19,7 @@
> #include <linux/ratelimit.h>
> #include <linux/rseq.h>
> #include <linux/syscalls.h>
> +#include <linux/pkeys.h>
>
> #include <asm/daifflags.h>
> #include <asm/debug-monitors.h>
> @@ -66,10 +67,63 @@ struct rt_sigframe_user_layout {
> unsigned long end_offset;
> };
>
> +/*
> + * Holds any EL0-controlled state that influences unprivileged memory accesses.
> + * This includes both accesses done in userspace and uaccess done in the kernel.
> + *
> + * This state needs to be carefully managed to ensure that it doesn't cause
> + * uaccess to fail when setting up the signal frame, and the signal handler
> + * itself also expects a well-defined state when entered.
> + */
> +struct user_access_state {
> + u64 por_el0;
> +};
> +
> #define BASE_SIGFRAME_SIZE round_up(sizeof(struct rt_sigframe), 16)
> #define TERMINATOR_SIZE round_up(sizeof(struct _aarch64_ctx), 16)
> #define EXTRA_CONTEXT_SIZE round_up(sizeof(struct extra_context), 16)
>
> +/*
> + * Save the user access state into ua_state and reset it to disable any
> + * restrictions.
> + */
> +static void save_reset_user_access_state(struct user_access_state *ua_state)
> +{
> + if (system_supports_poe()) {
> + u64 por_enable_all = 0;
> +
> + for (int pkey = 0; pkey < arch_max_pkey(); pkey++)
> + por_enable_all |= POE_RXW << (pkey * POR_BITS_PER_PKEY);
> +
> + ua_state->por_el0 = read_sysreg_s(SYS_POR_EL0);
> + write_sysreg_s(por_enable_all, SYS_POR_EL0);
> + /* Ensure that any subsequent uaccess observes the updated value */
> + isb();
> + }
> +}
> +
> +/*
> + * Set the user access state for invoking the signal handler.
> + *
> + * No uaccess should be done after that function is called.
> + */
> +static void set_handler_user_access_state(void)
> +{
> + if (system_supports_poe())
> + write_sysreg_s(POR_EL0_INIT, SYS_POR_EL0);
> +}
> +
> +/*
> + * Restore the user access state to the values saved in ua_state.
> + *
> + * No uaccess should be done after that function is called.
> + */
> +static void restore_user_access_state(const struct user_access_state *ua_state)
> +{
> + if (system_supports_poe())
> + write_sysreg_s(ua_state->por_el0, SYS_POR_EL0);
> +}
> +
> static void init_user_layout(struct rt_sigframe_user_layout *user)
> {
> const size_t reserved_size =
> @@ -261,18 +315,20 @@ static int restore_fpmr_context(struct user_ctxs *user)
> return err;
> }
>
> -static int preserve_poe_context(struct poe_context __user *ctx)
> +static int preserve_poe_context(struct poe_context __user *ctx,
> + const struct user_access_state *ua_state)
> {
> int err = 0;
>
> __put_user_error(POE_MAGIC, &ctx->head.magic, err);
> __put_user_error(sizeof(*ctx), &ctx->head.size, err);
> - __put_user_error(read_sysreg_s(SYS_POR_EL0), &ctx->por_el0, err);
> + __put_user_error(ua_state->por_el0, &ctx->por_el0, err);
>
> return err;
> }
>
> -static int restore_poe_context(struct user_ctxs *user)
> +static int restore_poe_context(struct user_ctxs *user,
> + struct user_access_state *ua_state)
> {
> u64 por_el0;
> int err = 0;
> @@ -282,7 +338,7 @@ static int restore_poe_context(struct user_ctxs *user)
>
> __get_user_error(por_el0, &(user->poe->por_el0), err);
> if (!err)
> - write_sysreg_s(por_el0, SYS_POR_EL0);
> + ua_state->por_el0 = por_el0;
>
> return err;
> }
> @@ -850,7 +906,8 @@ static int parse_user_sigframe(struct user_ctxs *user,
> }
>
> static int restore_sigframe(struct pt_regs *regs,
> - struct rt_sigframe __user *sf)
> + struct rt_sigframe __user *sf,
> + struct user_access_state *ua_state)
> {
> sigset_t set;
> int i, err;
> @@ -899,7 +956,7 @@ static int restore_sigframe(struct pt_regs *regs,
> err = restore_zt_context(&user);
>
> if (err == 0 && system_supports_poe() && user.poe)
> - err = restore_poe_context(&user);
> + err = restore_poe_context(&user, ua_state);
>
> return err;
> }
> @@ -908,6 +965,7 @@ SYSCALL_DEFINE0(rt_sigreturn)
> {
> struct pt_regs *regs = current_pt_regs();
> struct rt_sigframe __user *frame;
> + struct user_access_state ua_state;
>
> /* Always make any pending restarted system calls return -EINTR */
> current->restart_block.fn = do_no_restart_syscall;
> @@ -924,12 +982,14 @@ SYSCALL_DEFINE0(rt_sigreturn)
> if (!access_ok(frame, sizeof (*frame)))
> goto badframe;
>
> - if (restore_sigframe(regs, frame))
> + if (restore_sigframe(regs, frame, &ua_state))
> goto badframe;
>
> if (restore_altstack(&frame->uc.uc_stack))
> goto badframe;
>
Do you need to move restore_altstack() ahead of restore_sigframe(),
similar to the x86 change [1]?
The discussion for this happened in [2] [3].
[1] https://lore.kernel.org/lkml/20240802061318.2140081-5-aruna.ramakrishna@oracle.com/
[2] https://lore.kernel.org/lkml/20240425210540.3265342-1-jeffxu@chromium.org/
[3] https://lore.kernel.org/lkml/d0162c76c25bc8e1c876aebe8e243ff2e6862359.camel@intel.com/
Thanks
-Jeff
> + restore_user_access_state(&ua_state);
> +
> return regs->regs[0];
>
> badframe:
> @@ -1035,7 +1095,8 @@ static int setup_sigframe_layout(struct rt_sigframe_user_layout *user,
> }
>
> static int setup_sigframe(struct rt_sigframe_user_layout *user,
> - struct pt_regs *regs, sigset_t *set)
> + struct pt_regs *regs, sigset_t *set,
> + const struct user_access_state *ua_state)
> {
> int i, err = 0;
> struct rt_sigframe __user *sf = user->sigframe;
> @@ -1097,10 +1158,9 @@ static int setup_sigframe(struct rt_sigframe_user_layout *user,
> struct poe_context __user *poe_ctx =
> apply_user_offset(user, user->poe_offset);
>
> - err |= preserve_poe_context(poe_ctx);
> + err |= preserve_poe_context(poe_ctx, ua_state);
> }
>
> -
> /* ZA state if present */
> if (system_supports_sme() && err == 0 && user->za_offset) {
> struct za_context __user *za_ctx =
> @@ -1237,9 +1297,6 @@ static void setup_return(struct pt_regs *regs, struct k_sigaction *ka,
> sme_smstop();
> }
>
> - if (system_supports_poe())
> - write_sysreg_s(POR_EL0_INIT, SYS_POR_EL0);
> -
> if (ka->sa.sa_flags & SA_RESTORER)
> sigtramp = ka->sa.sa_restorer;
> else
> @@ -1253,6 +1310,7 @@ static int setup_rt_frame(int usig, struct ksignal *ksig, sigset_t *set,
> {
> struct rt_sigframe_user_layout user;
> struct rt_sigframe __user *frame;
> + struct user_access_state ua_state;
> int err = 0;
>
> fpsimd_signal_preserve_current_state();
> @@ -1260,13 +1318,14 @@ static int setup_rt_frame(int usig, struct ksignal *ksig, sigset_t *set,
> if (get_sigframe(&user, ksig, regs))
> return 1;
>
> + save_reset_user_access_state(&ua_state);
> frame = user.sigframe;
>
> __put_user_error(0, &frame->uc.uc_flags, err);
> __put_user_error(NULL, &frame->uc.uc_link, err);
>
> err |= __save_altstack(&frame->uc.uc_stack, regs->sp);
> - err |= setup_sigframe(&user, regs, set);
> + err |= setup_sigframe(&user, regs, set, &ua_state);
> if (err == 0) {
> setup_return(regs, &ksig->ka, &user, usig);
> if (ksig->ka.sa.sa_flags & SA_SIGINFO) {
> @@ -1276,6 +1335,11 @@ static int setup_rt_frame(int usig, struct ksignal *ksig, sigset_t *set,
> }
> }
>
> + if (err == 0)
> + set_handler_user_access_state();
> + else
> + restore_user_access_state(&ua_state);
> +
> return err;
> }
>
> --
> 2.43.0
>
* Re: [PATCH v3 1/5] arm64: signal: Improve POR_EL0 handling to avoid uaccess failures
2024-10-30 22:01 ` Jeff Xu
@ 2024-10-31 8:45 ` Kevin Brodsky
2024-10-31 18:43 ` Jeff Xu
2024-10-31 9:33 ` Will Deacon
1 sibling, 1 reply; 17+ messages in thread
From: Kevin Brodsky @ 2024-10-31 8:45 UTC (permalink / raw)
To: Jeff Xu
Cc: linux-arm-kernel, akpm, anshuman.khandual, aruna.ramakrishna,
broonie, catalin.marinas, dave.hansen, Dave.Martin, joey.gouly,
keith.lucas, pierre.langlois, shuah, sroettger, tglx, will,
yury.khrustalev, linux-kselftest, x86, Kees Cook,
Jorge Lucangeli Obes, Jann Horn
On 30/10/2024 23:01, Jeff Xu wrote:
>> -static int restore_poe_context(struct user_ctxs *user)
>> +static int restore_poe_context(struct user_ctxs *user,
>> + struct user_access_state *ua_state)
>> {
>> u64 por_el0;
>> int err = 0;
>> @@ -282,7 +338,7 @@ static int restore_poe_context(struct user_ctxs *user)
>>
>> __get_user_error(por_el0, &(user->poe->por_el0), err);
>> if (!err)
>> - write_sysreg_s(por_el0, SYS_POR_EL0);
>> + ua_state->por_el0 = por_el0;
>>
>> return err;
>> }
>> @@ -850,7 +906,8 @@ static int parse_user_sigframe(struct user_ctxs *user,
>> }
>>
>> static int restore_sigframe(struct pt_regs *regs,
>> - struct rt_sigframe __user *sf)
>> + struct rt_sigframe __user *sf,
>> + struct user_access_state *ua_state)
>> {
>> sigset_t set;
>> int i, err;
>> @@ -899,7 +956,7 @@ static int restore_sigframe(struct pt_regs *regs,
>> err = restore_zt_context(&user);
>>
>> if (err == 0 && system_supports_poe() && user.poe)
>> - err = restore_poe_context(&user);
>> + err = restore_poe_context(&user, ua_state);
>>
>> return err;
>> }
>> @@ -908,6 +965,7 @@ SYSCALL_DEFINE0(rt_sigreturn)
>> {
>> struct pt_regs *regs = current_pt_regs();
>> struct rt_sigframe __user *frame;
>> + struct user_access_state ua_state;
>>
>> /* Always make any pending restarted system calls return -EINTR */
>> current->restart_block.fn = do_no_restart_syscall;
>> @@ -924,12 +982,14 @@ SYSCALL_DEFINE0(rt_sigreturn)
>> if (!access_ok(frame, sizeof (*frame)))
>> goto badframe;
>>
>> - if (restore_sigframe(regs, frame))
>> + if (restore_sigframe(regs, frame, &ua_state))
>> goto badframe;
>>
>> if (restore_altstack(&frame->uc.uc_stack))
>> goto badframe;
>>
> Do you need to move restore_altstack ahead of restore_sigframe?
This is not necessary because restore_sigframe() no longer writes to
POR_EL0. restore_poe_context() (above) now saves the original POR_EL0
value into ua_state, and it is restore_user_access_state() (called below
just before returning to userspace) that actually writes to POR_EL0,
after all uaccess is completed.
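To make the ordering concrete, here is a rough userspace model of the return path. It is a sketch only: fake_por_el0 is a stand-in for the register, and stage_por_from_frame(), commit_por() and sigreturn_model() are hypothetical names that loosely mirror restore_poe_context(), restore_user_access_state() and sys_rt_sigreturn. A NULL frame pointer is used to simulate a faulting uaccess.

```c
#include <stddef.h>
#include <stdint.h>

struct user_access_state {
	uint64_t por_el0;
};

/* Hypothetical stand-in for the POR_EL0 system register. */
static uint64_t fake_por_el0;

/* Phase 1: read the saved value from the frame; the register is untouched. */
static int stage_por_from_frame(const uint64_t *frame_por,
				struct user_access_state *ua)
{
	if (frame_por == NULL)
		return -1; /* models a failed uaccess */
	ua->por_el0 = *frame_por;
	return 0;
}

/* Phase 2: write the register once every uaccess is done. */
static void commit_por(const struct user_access_state *ua)
{
	fake_por_el0 = ua->por_el0;
}

/* altstack_ok models whether the later restore_altstack() uaccess succeeds. */
static int sigreturn_model(const uint64_t *frame_por, int altstack_ok)
{
	struct user_access_state ua;

	if (stage_por_from_frame(frame_por, &ua))
		return -1;
	if (!altstack_ok)
		return -1; /* register still holds the pre-sigreturn value */
	commit_por(&ua);
	return 0;
}
```

In this model a failure after the frame has been parsed leaves fake_por_el0 unchanged, which is the property that makes reordering restore_altstack() unnecessary.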
Having said that, I somehow missed the call to restore_altstack() when
writing the commit message, so these changes in sys_rt_sigreturn are in
fact necessary. Good catch! At least the patch itself should be doing
the right thing.
- Kevin
> similar as x86 change [1],
> the discussion for this happened in [2] [3]
>
> [1] https://lore.kernel.org/lkml/20240802061318.2140081-5-aruna.ramakrishna@oracle.com/
> [2] https://lore.kernel.org/lkml/20240425210540.3265342-1-jeffxu@chromium.org/
> [3] https://lore.kernel.org/lkml/d0162c76c25bc8e1c876aebe8e243ff2e6862359.camel@intel.com/
>
> Thanks
> -Jeff
>
>
>> + restore_user_access_state(&ua_state);
>> +
>> return regs->regs[0];
* Re: [PATCH v3 1/5] arm64: signal: Improve POR_EL0 handling to avoid uaccess failures
2024-10-30 22:01 ` Jeff Xu
2024-10-31 8:45 ` Kevin Brodsky
@ 2024-10-31 9:33 ` Will Deacon
2024-10-31 10:23 ` Kevin Brodsky
1 sibling, 1 reply; 17+ messages in thread
From: Will Deacon @ 2024-10-31 9:33 UTC (permalink / raw)
To: Jeff Xu
Cc: Kevin Brodsky, linux-arm-kernel, akpm, anshuman.khandual,
aruna.ramakrishna, broonie, catalin.marinas, dave.hansen,
Dave.Martin, joey.gouly, keith.lucas, pierre.langlois, shuah,
sroettger, tglx, yury.khrustalev, linux-kselftest, x86, Kees Cook,
Jorge Lucangeli Obes, Jann Horn
Hi Jeff,
Thanks for chiming in!
On Wed, Oct 30, 2024 at 03:01:53PM -0700, Jeff Xu wrote:
> On Tue, Oct 29, 2024 at 7:46 AM Kevin Brodsky <kevin.brodsky@arm.com> wrote:
> >
> > TL;DR: reset POR_EL0 to "allow all" before writing the signal frame,
> > preventing spurious uaccess failures.
[...]
> > @@ -924,12 +982,14 @@ SYSCALL_DEFINE0(rt_sigreturn)
> > if (!access_ok(frame, sizeof (*frame)))
> > goto badframe;
> >
> > - if (restore_sigframe(regs, frame))
> > + if (restore_sigframe(regs, frame, &ua_state))
> > goto badframe;
> >
> > if (restore_altstack(&frame->uc.uc_stack))
> > goto badframe;
> >
> Do you need to move restore_altstack ahead of restore_sigframe?
> similar as x86 change [1],
> the discussion for this happened in [2] [3]
>
> [1] https://lore.kernel.org/lkml/20240802061318.2140081-5-aruna.ramakrishna@oracle.com/
> [2] https://lore.kernel.org/lkml/20240425210540.3265342-1-jeffxu@chromium.org/
> [3] https://lore.kernel.org/lkml/d0162c76c25bc8e1c876aebe8e243ff2e6862359.camel@intel.com/
>
> > + restore_user_access_state(&ua_state);
The POR isn't restored until here ^^^, so I _think_ restore_altstack()
is fine where it is. Kevin, can you confirm, please?
Will
* Re: [PATCH v3 1/5] arm64: signal: Improve POR_EL0 handling to avoid uaccess failures
2024-10-31 9:33 ` Will Deacon
@ 2024-10-31 10:23 ` Kevin Brodsky
0 siblings, 0 replies; 17+ messages in thread
From: Kevin Brodsky @ 2024-10-31 10:23 UTC (permalink / raw)
To: Will Deacon, Jeff Xu
Cc: linux-arm-kernel, akpm, anshuman.khandual, aruna.ramakrishna,
broonie, catalin.marinas, dave.hansen, Dave.Martin, joey.gouly,
keith.lucas, pierre.langlois, shuah, sroettger, tglx,
yury.khrustalev, linux-kselftest, x86, Kees Cook,
Jorge Lucangeli Obes, Jann Horn
On 31/10/2024 10:33, Will Deacon wrote:
> Hi Jeff,
>
> Thanks for chiming in!
>
> On Wed, Oct 30, 2024 at 03:01:53PM -0700, Jeff Xu wrote:
>> On Tue, Oct 29, 2024 at 7:46 AM Kevin Brodsky <kevin.brodsky@arm.com> wrote:
>>> TL;DR: reset POR_EL0 to "allow all" before writing the signal frame,
>>> preventing spurious uaccess failures.
> [...]
>
>>> @@ -924,12 +982,14 @@ SYSCALL_DEFINE0(rt_sigreturn)
>>> if (!access_ok(frame, sizeof (*frame)))
>>> goto badframe;
>>>
>>> - if (restore_sigframe(regs, frame))
>>> + if (restore_sigframe(regs, frame, &ua_state))
>>> goto badframe;
>>>
>>> if (restore_altstack(&frame->uc.uc_stack))
>>> goto badframe;
>>>
>> Do you need to move restore_altstack ahead of restore_sigframe?
>> similar as x86 change [1],
>> the discussion for this happened in [2] [3]
>>
>> [1] https://lore.kernel.org/lkml/20240802061318.2140081-5-aruna.ramakrishna@oracle.com/
>> [2] https://lore.kernel.org/lkml/20240425210540.3265342-1-jeffxu@chromium.org/
>> [3] https://lore.kernel.org/lkml/d0162c76c25bc8e1c876aebe8e243ff2e6862359.camel@intel.com/
>>
>>> + restore_user_access_state(&ua_state);
> The POR isn't restored until here ^^^, so I _think_ restore_altstack()
> is fine where it is. Kevin, can you confirm, please?
Yes, that's correct, see my earlier reply [1].
- Kevin
[1]
https://lore.kernel.org/all/cd0e114d-57eb-4c90-bb6f-9abf0cc8920f@arm.com/
* Re: [PATCH v3 0/5] Improve arm64 pkeys handling in signal delivery
2024-10-30 21:59 ` Jeff Xu
@ 2024-10-31 13:00 ` Kevin Brodsky
2024-10-31 16:52 ` Jeff Xu
0 siblings, 1 reply; 17+ messages in thread
From: Kevin Brodsky @ 2024-10-31 13:00 UTC (permalink / raw)
To: Jeff Xu
Cc: linux-arm-kernel, akpm, anshuman.khandual, aruna.ramakrishna,
broonie, catalin.marinas, dave.hansen, Dave.Martin, joey.gouly,
keith.lucas, pierre.langlois, shuah, sroettger, tglx, will,
yury.khrustalev, linux-kselftest, x86, Kees Cook,
Jorge Lucangeli Obes, Jann Horn
On 30/10/2024 22:59, Jeff Xu wrote:
> Apologize in advance that I'm unfamiliar with ARM's POR, up to review
> this patch series, so I might ask silly questions or based on wrong
> understanding.
That's no problem, your input is very welcome! There is no fundamental
difference between POR and PKRU AFAIK - the encoding is different, but
the principle is the same. The main thing to keep in mind is that POE
(the arm64 extension) allows restricting execution in addition to
read/write.
> It seems that the patch has the same logic as Aruna Ramakrishna
> proposed for X86, is this correct ?
Yes, patch 1 aims at aligning arm64 with x86 (same behaviour). Going
forward I think we should try and keep the arm64 and x86 handling of
pkeys as consistent as possible.
> In the latest version of x86 change [1], I have a comment if we want
> to consider adding a new flag SS_PKEYALTSTACK (see SS_AUTODISARM as an
> example) in sigaltstack, and restrict this mechanism (overwriting
> PKRU/POR_EL0 and sigframe) to sigaltstack() with SS_PKEYALTSTACK.
> There is a subtle difference if we do that, i.e. the existing
> signaling handling user might not care or do not use PKEY/POE, and
> overwriting PKRU/POR_EL0 and sigframe every time will add extra CPU
> time on the signaling delivery, which could be real-time sensitive.
From a purely functional perspective, resetting POR to allow access to
all pkeys before writing the signal frame should be safe in any context,
and allows keeping the handling simple (no conditional code). The
performance aspect is a fair point though, as we are adding an ISB
(synchronisation barrier) on the signal delivery path if POE is supported.
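As an illustration of what "allow access to all pkeys" means for the register value, the loop from patch 1 can be reproduced in userspace roughly as follows. This is a sketch: the POE_RXW and POR_BITS_PER_PKEY values below are assumptions that should be checked against the arm64 headers in your tree, and por_enable_all() is written here as a standalone helper taking the pkey count as a parameter rather than calling arch_max_pkey().

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values mirroring the arm64 definitions; verify against your tree. */
#define POE_RXW           UINT64_C(0x7) /* read + execute + write */
#define POR_BITS_PER_PKEY 4

/* Build a POR_EL0 value granting RXW to pkeys [0, nr_pkeys). */
static uint64_t por_enable_all(int nr_pkeys)
{
	uint64_t por = 0;

	/* A 64-bit register holds at most 16 four-bit fields. */
	assert(nr_pkeys >= 0 && nr_pkeys <= 64 / POR_BITS_PER_PKEY);

	for (int pkey = 0; pkey < nr_pkeys; pkey++)
		por |= POE_RXW << (pkey * POR_BITS_PER_PKEY);

	return por;
}
```

With these assumed constants, por_enable_all(8) evaluates to 0x77777777 and por_enable_all(1) to 0x7, the latter being access to pkey 0 only.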
> Since I raised this comment on X86, I think raising it for ARM for
> discussion would be ok,
> it might make sense to have consistent API experience for arm/x86 here.
And indeed this is what I think is most important at this point.
Considering that Aruna's series resets PKRU unconditionally (sigaltstack
or not) and has already been pulled into mainline during 6.12-rc1 [2], I
still believe that patch 1 is doing the right thing, i.e. aligning arm64
with x86. If the concern with performance is confirmed, and there is an
agreement to reset POR/PKRU less eagerly on both architectures, this
could potentially be revisited.
- Kevin
[2]
https://lore.kernel.org/lkml/172656199227.2471820.13578261908219597067.tglx@xen13/
> [1] https://lore.kernel.org/lkml/CABi2SkWjF2Sicrr71=a6H8XJyf9q9L_Nd5FPp0CJ2mvB46Rrrg@mail.gmail.com/
* Re: [PATCH v3 0/5] Improve arm64 pkeys handling in signal delivery
2024-10-31 13:00 ` Kevin Brodsky
@ 2024-10-31 16:52 ` Jeff Xu
0 siblings, 0 replies; 17+ messages in thread
From: Jeff Xu @ 2024-10-31 16:52 UTC (permalink / raw)
To: Kevin Brodsky
Cc: linux-arm-kernel, akpm, anshuman.khandual, aruna.ramakrishna,
broonie, catalin.marinas, dave.hansen, Dave.Martin, joey.gouly,
keith.lucas, pierre.langlois, shuah, sroettger, tglx, will,
yury.khrustalev, linux-kselftest, x86, Kees Cook,
Jorge Lucangeli Obes, Jann Horn
On Thu, Oct 31, 2024 at 6:00 AM Kevin Brodsky <kevin.brodsky@arm.com> wrote:
>
> On 30/10/2024 22:59, Jeff Xu wrote:
> > Apologize in advance that I'm unfamiliar with ARM's POR, up to review
> > this patch series, so I might ask silly questions or based on wrong
> > understanding.
>
> That's no problem, your input is very welcome! There is no fundamental
> difference between POR and PKRU AFAIK - the encoding is different, but
> the principle is the same. The main thing to keep in mind is that POE
> (the arm64 extension) allows restricting execution in addition to
> read/write.
>
> > It seems that the patch has the same logic as Aruna Ramakrishna
> > proposed for X86, is this correct ?
>
> Yes, patch 1 aims at aligning arm64 with x86 (same behaviour). Going
> forward I think we should try and keep the arm64 and x86 handling of
> pkeys as consistent as possible.
>
> > In the latest version of x86 change [1], I have a comment if we want
> > to consider adding a new flag SS_PKEYALTSTACK (see SS_AUTODISARM as an
> > example) in sigaltstack, and restrict this mechanism (overwriting
> > PKRU/POR_EL0 and sigframe) to sigaltstack() with SS_PKEYALTSTACK.
> > There is a subtle difference if we do that, i.e. the existing
> > signaling handling user might not care or do not use PKEY/POE, and
> > overwriting PKRU/POR_EL0 and sigframe every time will add extra CPU
> > time on the signaling delivery, which could be real-time sensitive.
>
> From a purely functional perspective, resetting POR to allow access to
> all pkeys before writing the signal frame should be safe in any context,
> and allows keeping the handling simple (no conditional code). The
> performance aspect is a fair point though, as we are adding an ISB
> (synchronisation barrier) on the signal delivery path if POE is supported.
>
Yes, functionally it is the same.
Having worked a bit on real-time systems in the past, I'm aware that
signal handling paths are real-time sensitive.
> > Since I raised this comment on X86, I think raising it for ARM for
> > discussion would be ok,
> > it might make sense to have consistent API experience for arm/x86 here.
>
> And indeed this is what I think is most important at this point.
> Considering that Aruna's series resets PKRU unconditionally (sigaltstack
> or not) and has already been pulled into mainline during 6.12-rc1 [2], I
> still believe that patch 1 is doing the right thing, i.e. aligning arm64
> with x86. If the concern with performance is confirmed, and there is an
> agreement to reset POR/PKRU less eagerly on both architectures, this
> could potentially be revisited.
>
Oh, I didn't know it was already in mainline. My information is outdated.
It does feel a little rushed, because my comment about the performance
aspect wasn't addressed or responded to.
-Jeff
> - Kevin
>
> [2]
> https://lore.kernel.org/lkml/172656199227.2471820.13578261908219597067.tglx@xen13/
>
> > [1] https://lore.kernel.org/lkml/CABi2SkWjF2Sicrr71=a6H8XJyf9q9L_Nd5FPp0CJ2mvB46Rrrg@mail.gmail.com/
>
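
For readers comparing the two registers discussed above (same principle,
different encoding, with POE additionally able to restrict execution), here
is a minimal userspace sketch of how a per-pkey field could be packed into
each register. The SKETCH_* names and helpers are hypothetical. The field
widths and bit values (4 bits per pkey with R=0x1, X=0x2, W=0x4 for POR_EL0;
2 bits per pkey with access-disable=0x1, write-disable=0x2 for PKRU) are
assumptions based on my reading of the kernel headers and should be checked
against them before being relied upon.

```c
#include <stdint.h>

/* Assumed POR_EL0 layout: one 4-bit permission field per pkey. */
#define SKETCH_POR_BITS_PER_PKEY 4
#define SKETCH_POE_R    0x1u  /* read allowed */
#define SKETCH_POE_X    0x2u  /* execute allowed */
#define SKETCH_POE_W    0x4u  /* write allowed */
#define SKETCH_POE_MASK 0xfu

/* Assumed PKRU layout: one 2-bit "deny" field per pkey. */
#define SKETCH_PKRU_BITS_PER_PKEY 2
#define SKETCH_PKRU_AD   0x1u /* access disable */
#define SKETCH_PKRU_WD   0x2u /* write disable */
#define SKETCH_PKRU_MASK 0x3u

/* Return `por` with the permission field of `pkey` replaced by `perm`. */
static uint64_t sketch_por_set(uint64_t por, unsigned int pkey, uint64_t perm)
{
	unsigned int shift = pkey * SKETCH_POR_BITS_PER_PKEY;

	por &= ~((uint64_t)SKETCH_POE_MASK << shift);
	return por | ((perm & SKETCH_POE_MASK) << shift);
}

/* Return `pkru` with the deny field of `pkey` replaced by `deny`. */
static uint32_t sketch_pkru_set(uint32_t pkru, unsigned int pkey, uint32_t deny)
{
	unsigned int shift = pkey * SKETCH_PKRU_BITS_PER_PKEY;

	pkru &= ~((uint32_t)SKETCH_PKRU_MASK << shift);
	return pkru | ((deny & SKETCH_PKRU_MASK) << shift);
}
```

Note the opposite polarity under these assumptions: a POR_EL0 field lists
what is allowed, whereas a PKRU field lists what is denied, and only the
former has a bit for execute permission.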
* Re: [PATCH v3 1/5] arm64: signal: Improve POR_EL0 handling to avoid uaccess failures
2024-10-31 8:45 ` Kevin Brodsky
@ 2024-10-31 18:43 ` Jeff Xu
0 siblings, 0 replies; 17+ messages in thread
From: Jeff Xu @ 2024-10-31 18:43 UTC (permalink / raw)
To: Kevin Brodsky
Cc: linux-arm-kernel, akpm, anshuman.khandual, aruna.ramakrishna,
broonie, catalin.marinas, dave.hansen, Dave.Martin, joey.gouly,
keith.lucas, pierre.langlois, shuah, sroettger, tglx, will,
yury.khrustalev, linux-kselftest, x86, Kees Cook,
Jorge Lucangeli Obes, Jann Horn
On Thu, Oct 31, 2024 at 1:45 AM Kevin Brodsky <kevin.brodsky@arm.com> wrote:
>
> On 30/10/2024 23:01, Jeff Xu wrote:
> >> -static int restore_poe_context(struct user_ctxs *user)
> >> +static int restore_poe_context(struct user_ctxs *user,
> >> + struct user_access_state *ua_state)
> >> {
> >> u64 por_el0;
> >> int err = 0;
> >> @@ -282,7 +338,7 @@ static int restore_poe_context(struct user_ctxs *user)
> >>
> >> __get_user_error(por_el0, &(user->poe->por_el0), err);
> >> if (!err)
> >> - write_sysreg_s(por_el0, SYS_POR_EL0);
> >> + ua_state->por_el0 = por_el0;
> >>
> >> return err;
> >> }
> >> @@ -850,7 +906,8 @@ static int parse_user_sigframe(struct user_ctxs *user,
> >> }
> >>
> >> static int restore_sigframe(struct pt_regs *regs,
> >> - struct rt_sigframe __user *sf)
> >> + struct rt_sigframe __user *sf,
> >> + struct user_access_state *ua_state)
> >> {
> >> sigset_t set;
> >> int i, err;
> >> @@ -899,7 +956,7 @@ static int restore_sigframe(struct pt_regs *regs,
> >> err = restore_zt_context(&user);
> >>
> >> if (err == 0 && system_supports_poe() && user.poe)
> >> - err = restore_poe_context(&user);
> >> + err = restore_poe_context(&user, ua_state);
> >>
> >> return err;
> >> }
> >> @@ -908,6 +965,7 @@ SYSCALL_DEFINE0(rt_sigreturn)
> >> {
> >> struct pt_regs *regs = current_pt_regs();
> >> struct rt_sigframe __user *frame;
> >> + struct user_access_state ua_state;
> >>
> >> /* Always make any pending restarted system calls return -EINTR */
> >> current->restart_block.fn = do_no_restart_syscall;
> >> @@ -924,12 +982,14 @@ SYSCALL_DEFINE0(rt_sigreturn)
> >> if (!access_ok(frame, sizeof (*frame)))
> >> goto badframe;
> >>
> >> - if (restore_sigframe(regs, frame))
> >> + if (restore_sigframe(regs, frame, &ua_state))
> >> goto badframe;
> >>
> >> if (restore_altstack(&frame->uc.uc_stack))
> >> goto badframe;
> >>
> > Do you need to move restore_altstack ahead of restore_sigframe?
>
> This is not necessary because restore_sigframe() no longer writes to
> POR_EL0. restore_poe_context() (above) now saves the original POR_EL0
> value into ua_state, and it is restore_user_access_state() (called below
> just before returning to userspace) that actually writes to POR_EL0,
> after all uaccess is completed.
>
Got it, thanks for the explanation.
-Jeff
> Having said that, I somehow missed the call to restore_altstack() when
> writing the commit message, so these changes in sys_rt_sigreturn are in
> fact necessary. Good catch! At least the patch itself should be doing
> the right thing.
>
> - Kevin
>
> > similar as x86 change [1],
> > the discussion for this happened in [2] [3]
> >
> > [1] https://lore.kernel.org/lkml/20240802061318.2140081-5-aruna.ramakrishna@oracle.com/
> > [2] https://lore.kernel.org/lkml/20240425210540.3265342-1-jeffxu@chromium.org/
> > [3] https://lore.kernel.org/lkml/d0162c76c25bc8e1c876aebe8e243ff2e6862359.camel@intel.com/
> >
> > Thanks
> > -Jeff
> >
> >
> >> + restore_user_access_state(&ua_state);
> >> +
> >> return regs->regs[0];
>
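
The ordering Kevin describes above (restore_poe_context() only records the
value, and the register is written once all uaccess is done) can be
illustrated with a small userspace simulation. This is only a sketch: the
sim_* names, the plain variable standing in for POR_EL0, the "allow all"
constant and the save/reset step at the start are assumptions made for
illustration, not the kernel code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Plain variable standing in for the POR_EL0 system register. */
static uint64_t sim_por_el0;

/* Hypothetical permissive value used while "uaccess" is in progress. */
#define SIM_POR_ALLOW_ALL UINT64_MAX

struct user_access_state {
	uint64_t por_el0;
};

/* Assumed first step: remember the current value, then open up access. */
static void sim_save_reset_user_access_state(struct user_access_state *ua)
{
	assert(ua != NULL);
	ua->por_el0 = sim_por_el0;
	sim_por_el0 = SIM_POR_ALLOW_ALL;
}

/* Record the value found in the frame; do NOT write the register here. */
static int sim_restore_poe_context(const uint64_t *frame_por,
				   struct user_access_state *ua)
{
	if (frame_por == NULL)
		return -1;	/* stands in for a failed user read */
	ua->por_el0 = *frame_por;
	return 0;
}

/* Last step: the only place where the register takes its final value. */
static void sim_restore_user_access_state(const struct user_access_state *ua)
{
	sim_por_el0 = ua->por_el0;
}

/* Same ordering as the sigreturn path discussed in the thread. */
static int sim_sigreturn(const uint64_t *frame_por)
{
	struct user_access_state ua;

	sim_save_reset_user_access_state(&ua);

	if (sim_restore_poe_context(frame_por, &ua))
		return -1;	/* this sketch just reports the bad frame */

	/*
	 * Any further user memory access (the altstack restore in the
	 * thread) would happen here, still with the permissive value.
	 */
	assert(sim_por_el0 == SIM_POR_ALLOW_ALL);

	sim_restore_user_access_state(&ua);
	return 0;
}
```

Because the register keeps the permissive value until the final call, the
order of the intermediate steps does not matter in this model, which is the
reason given above for not having to move restore_altstack().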
* Re: (subset) [PATCH v3 0/5] Improve arm64 pkeys handling in signal delivery
2024-10-29 14:45 [PATCH v3 0/5] Improve arm64 pkeys handling in signal delivery Kevin Brodsky
` (6 preceding siblings ...)
2024-10-30 21:59 ` Jeff Xu
@ 2024-11-04 17:12 ` Catalin Marinas
7 siblings, 0 replies; 17+ messages in thread
From: Catalin Marinas @ 2024-11-04 17:12 UTC (permalink / raw)
To: linux-arm-kernel, Kevin Brodsky
Cc: Will Deacon, akpm, anshuman.khandual, aruna.ramakrishna, broonie,
dave.hansen, Dave.Martin, jeffxu, joey.gouly, keith.lucas,
pierre.langlois, shuah, sroettger, tglx, yury.khrustalev,
linux-kselftest, x86
On Tue, 29 Oct 2024 14:45:34 +0000, Kevin Brodsky wrote:
> This series is a follow-up to Joey's Permission Overlay Extension (POE)
> series [1] that recently landed on mainline. The goal is to improve the
> way we handle the register that governs which pkeys/POIndex are
> accessible (POR_EL0) during signal delivery. As things stand, we may
> unexpectedly fail to write the signal frame on the stack because POR_EL0
> is not reset before the uaccess operations. See patch 1 for more details
> and the main changes this series brings.
>
> [...]
Applied to arm64 (for-next/pkey-signal), thanks!
I took the kselftest patches through the arm64 tree as well (patch 4
acked by Dave Hansen from an x86 angle). Patch 1 has already been merged
as a fix (this branch is based on top of the arm64 for-next/fixes one
which contains patch 1).
[2/5] arm64: signal: Remove unnecessary check when saving POE state
https://git.kernel.org/arm64/c/466ece4c6e19
[3/5] arm64: signal: Remove unused macro
https://git.kernel.org/arm64/c/8edbbfcc1ed3
[4/5] selftests/mm: Use generic pkey register manipulation
https://git.kernel.org/arm64/c/6e182dc9f268
[5/5] selftests/mm: Enable pkey_sighandler_tests on arm64
https://git.kernel.org/arm64/c/49f59573e9e0
--
Catalin
Thread overview: 17+ messages
2024-10-29 14:45 [PATCH v3 0/5] Improve arm64 pkeys handling in signal delivery Kevin Brodsky
2024-10-29 14:45 ` [PATCH v3 1/5] arm64: signal: Improve POR_EL0 handling to avoid uaccess failures Kevin Brodsky
2024-10-30 22:01 ` Jeff Xu
2024-10-31 8:45 ` Kevin Brodsky
2024-10-31 18:43 ` Jeff Xu
2024-10-31 9:33 ` Will Deacon
2024-10-31 10:23 ` Kevin Brodsky
2024-10-29 14:45 ` [PATCH v3 2/5] arm64: signal: Remove unnecessary check when saving POE state Kevin Brodsky
2024-10-29 14:45 ` [PATCH v3 3/5] arm64: signal: Remove unused macro Kevin Brodsky
2024-10-29 14:45 ` [PATCH v3 4/5] selftests/mm: Use generic pkey register manipulation Kevin Brodsky
2024-10-29 17:42 ` Dave Hansen
2024-10-29 14:45 ` [PATCH v3 5/5] selftests/mm: Enable pkey_sighandler_tests on arm64 Kevin Brodsky
2024-10-29 18:28 ` [PATCH v3 0/5] Improve arm64 pkeys handling in signal delivery Will Deacon
2024-10-30 21:59 ` Jeff Xu
2024-10-31 13:00 ` Kevin Brodsky
2024-10-31 16:52 ` Jeff Xu
2024-11-04 17:12 ` (subset) " Catalin Marinas