linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH v5 0/2] static call support for arm64
@ 2021-10-27 23:34 Ard Biesheuvel
  2021-10-27 23:34 ` [PATCH v5 1/2] static_call: force symbol references with external linkage for CFI/LTO Ard Biesheuvel
  2021-10-27 23:34 ` [PATCH v5 2/2] arm64: implement support for static call trampolines Ard Biesheuvel
  0 siblings, 2 replies; 3+ messages in thread
From: Ard Biesheuvel @ 2021-10-27 23:34 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Ard Biesheuvel, Mark Rutland, Quentin Perret, Catalin Marinas,
	James Morse, Will Deacon, Frederic Weisbecker, Peter Zijlstra,
	Kees Cook

Changes since v4:
- add preparatory patch to address generic CFI/LTO issues with static
  calls
- add comment to patch #2 describing the trampoline layout
- add handling of Clang CFI jump table entries
- add PeterZ's ack to patch #2

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Quentin Perret <qperret@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Kees Cook <keescook@chromium.org>

Ard Biesheuvel (2):
  static_call: force symbol references with external linkage for CFI/LTO
  arm64: implement support for static call trampolines

 arch/arm64/Kconfig                   |  1 +
 arch/arm64/include/asm/static_call.h | 40 +++++++++++
 arch/arm64/kernel/patching.c         | 72 +++++++++++++++++++-
 arch/arm64/kernel/vmlinux.lds.S      |  1 +
 include/linux/static_call.h          | 21 +++++-
 5 files changed, 130 insertions(+), 5 deletions(-)
 create mode 100644 arch/arm64/include/asm/static_call.h

-- 
2.30.2



* [PATCH v5 1/2] static_call: force symbol references with external linkage for CFI/LTO
  2021-10-27 23:34 [PATCH v5 0/2] static call support for arm64 Ard Biesheuvel
@ 2021-10-27 23:34 ` Ard Biesheuvel
  2021-10-27 23:34 ` [PATCH v5 2/2] arm64: implement support for static call trampolines Ard Biesheuvel
  1 sibling, 0 replies; 3+ messages in thread
From: Ard Biesheuvel @ 2021-10-27 23:34 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Ard Biesheuvel, Mark Rutland, Quentin Perret, Catalin Marinas,
	James Morse, Will Deacon, Frederic Weisbecker, Peter Zijlstra,
	Kees Cook

When building with Clang with CFI or LTO enabled, the linker may decide
not to emit symbols for functions with static linkage at all, or to
emit them under different names. This breaks static calls, given that
we refer to such functions both from C code and from assembler, and
expect the names to match.

So let's force the use of an alias with external linkage in a way that
is visible to the compiler. This ensures that the C name and the asm
name are identical.
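
As a rough sketch (the function name and the __UNIQUE_ID() result below
are made up), the net effect under LTO is an expansion along these
lines:

    static int my_func(int arg);	/* may be renamed or dropped by LTO/CFI */

    /*
     * Alias with external linkage: the compiler and the assembler now
     * agree on one stable name for the initial call target.
     */
    extern typeof(my_func) __UNIQUE_ID_my_func_0 __alias(my_func);

    __DEFINE_STATIC_CALL(my_call, my_func, __UNIQUE_ID_my_func_0);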

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 include/linux/static_call.h | 21 ++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/include/linux/static_call.h b/include/linux/static_call.h
index 3e56a9751c06..19dc210214c0 100644
--- a/include/linux/static_call.h
+++ b/include/linux/static_call.h
@@ -327,10 +327,27 @@ static inline int static_call_text_reserved(void *start, void *end)
 
 #endif /* CONFIG_HAVE_STATIC_CALL */
 
+#ifdef CONFIG_LTO
+/*
+ * DEFINE_STATIC_CALL() accepts any function symbol reference for its _func
+ * argument, but this may cause problems under Clang LTO/CFI if the function
+ * symbol has static linkage, because the symbol names exposed at the
+ * asm/object level may deviate from the C names. So let's force the reference
+ * to go via an alias with external linkage instead.
+ */
+#define _DEFINE_STATIC_CALL(name, _func, _init, _alias)			\
+	extern typeof(_func) _alias __alias(_init);			\
+	__DEFINE_STATIC_CALL(name, _func, _alias)
+#else
+#define _DEFINE_STATIC_CALL(name, _func, _init, _alias)			\
+	__DEFINE_STATIC_CALL(name, _func, _init)
+#endif
+
 #define DEFINE_STATIC_CALL(name, _func)					\
-	__DEFINE_STATIC_CALL(name, _func, _func)
+	_DEFINE_STATIC_CALL(name, _func, _func, __UNIQUE_ID(_func))
 
 #define DEFINE_STATIC_CALL_RET0(name, _func)				\
-	__DEFINE_STATIC_CALL(name, _func, __static_call_return0)
+	_DEFINE_STATIC_CALL(name, _func, __static_call_return0,		\
+			    __UNIQUE_ID(_func))
 
 #endif /* _LINUX_STATIC_CALL_H */
-- 
2.30.2



* [PATCH v5 2/2] arm64: implement support for static call trampolines
  2021-10-27 23:34 [PATCH v5 0/2] static call support for arm64 Ard Biesheuvel
  2021-10-27 23:34 ` [PATCH v5 1/2] static_call: force symbol references with external linkage for CFI/LTO Ard Biesheuvel
@ 2021-10-27 23:34 ` Ard Biesheuvel
  1 sibling, 0 replies; 3+ messages in thread
From: Ard Biesheuvel @ 2021-10-27 23:34 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Ard Biesheuvel, Mark Rutland, Quentin Perret, Catalin Marinas,
	James Morse, Will Deacon, Frederic Weisbecker, Peter Zijlstra,
	Kees Cook

Implement arm64 support for the 'unoptimized' static call variety, which
routes all calls through a single trampoline that is patched to perform a
tail call to the selected function.
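
For context, a minimal sketch of the generic static call API that such
a trampoline backs (my_call, my_func and other_func are made-up names):

    #include <linux/static_call.h>

    static int my_func(int arg)
    {
            return arg + 1;
    }

    static int other_func(int arg)
    {
            return arg * 2;
    }

    /* emits the trampoline; the header below places it in .static_call.text */
    DEFINE_STATIC_CALL(my_call, my_func);

    int caller(void)
    {
            /* compiles to a direct branch-and-link to the trampoline */
            int ret = static_call(my_call)(5);

            /* repatches the trampoline to tail-call other_func */
            static_call_update(my_call, &other_func);
            return ret;
    }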

It is expected that the direct branch instruction will be able to cover
the common case. However, given that static call targets may be located
in modules loaded out of direct branching range, we need a fallback path
that loads the target address into x16 and uses a branch-to-register
(BR) instruction to perform an indirect call.

Unlike on x86, there is no pressing need on arm64 to avoid indirect
calls at all cost, but hiding them from the compiler, as is done here,
does have some benefits:
- the literal is located in .text, which gives us the same robustness
  advantage that code patching does;
- no performance hit on CFI-enabled Clang builds, which decorate
  compiler-emitted indirect calls with branch target validity checks.

Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm64/Kconfig                   |  1 +
 arch/arm64/include/asm/static_call.h | 40 +++++++++++
 arch/arm64/kernel/patching.c         | 72 +++++++++++++++++++-
 arch/arm64/kernel/vmlinux.lds.S      |  1 +
 4 files changed, 111 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 228f39a35908..d9caa83b0f9f 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -193,6 +193,7 @@ config ARM64
 	select HAVE_PERF_USER_STACK_DUMP
 	select HAVE_PREEMPT_DYNAMIC
 	select HAVE_REGS_AND_STACK_ACCESS_API
+	select HAVE_STATIC_CALL
 	select HAVE_FUNCTION_ARG_ACCESS_API
 	select HAVE_FUTEX_CMPXCHG if FUTEX
 	select MMU_GATHER_RCU_TABLE_FREE
diff --git a/arch/arm64/include/asm/static_call.h b/arch/arm64/include/asm/static_call.h
new file mode 100644
index 000000000000..b8b168174c52
--- /dev/null
+++ b/arch/arm64/include/asm/static_call.h
@@ -0,0 +1,40 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_STATIC_CALL_H
+#define _ASM_STATIC_CALL_H
+
+/*
+ * The sequence below is laid out in a way that guarantees that the literal and
+ * the instruction are always covered by the same cacheline, and can be updated
+ * using a single store-pair instruction (if we rewrite the BTI C instruction
+ * as well). This means the literal and the instruction are always in sync when
+ * observed via the D-side.
+ *
+ * However, this does not guarantee that the I-side will catch up immediately
+ * as well: until the I-cache maintenance completes, CPUs may branch to the old
+ * target, or execute a stale NOP or RET. We deal with this by writing the
+ * literal unconditionally, even if it is 0x0 or the branch is in range. That
+ * way, a stale NOP will fall through and call the new target via an indirect
+ * call. Stale RETs or Bs will be taken as before, and branch to the old
+ * target until the I-side catches up.
+ */
+#define __ARCH_DEFINE_STATIC_CALL_TRAMP(name, insn)			    \
+	asm("	.pushsection	.static_call.text, \"ax\"		\n" \
+	    "	.align		4					\n" \
+	    "	.globl		" STATIC_CALL_TRAMP_STR(name) "		\n" \
+	    "0:	.quad	0x0						\n" \
+	    STATIC_CALL_TRAMP_STR(name) ":				\n" \
+	    "	hint 	34	/* BTI C */				\n" \
+		insn "							\n" \
+	    "	ldr	x16, 0b						\n" \
+	    "	cbz	x16, 1f						\n" \
+	    "	br	x16						\n" \
+	    "1:	ret							\n" \
+	    "	.popsection						\n")
+
+#define ARCH_DEFINE_STATIC_CALL_TRAMP(name, func)			\
+	__ARCH_DEFINE_STATIC_CALL_TRAMP(name, "b " #func)
+
+#define ARCH_DEFINE_STATIC_CALL_NULL_TRAMP(name)			\
+	__ARCH_DEFINE_STATIC_CALL_TRAMP(name, "ret")
+
+#endif /* _ASM_STATIC_CALL_H */
diff --git a/arch/arm64/kernel/patching.c b/arch/arm64/kernel/patching.c
index 771f543464e0..646d1bd16482 100644
--- a/arch/arm64/kernel/patching.c
+++ b/arch/arm64/kernel/patching.c
@@ -66,7 +66,7 @@ int __kprobes aarch64_insn_read(void *addr, u32 *insnp)
 	return ret;
 }
 
-static int __kprobes __aarch64_insn_write(void *addr, __le32 insn)
+static int __kprobes __aarch64_insn_write(void *addr, void *insn, int size)
 {
 	void *waddr = addr;
 	unsigned long flags = 0;
@@ -75,7 +75,7 @@ static int __kprobes __aarch64_insn_write(void *addr, __le32 insn)
 	raw_spin_lock_irqsave(&patch_lock, flags);
 	waddr = patch_map(addr, FIX_TEXT_POKE0);
 
-	ret = copy_to_kernel_nofault(waddr, &insn, AARCH64_INSN_SIZE);
+	ret = copy_to_kernel_nofault(waddr, insn, size);
 
 	patch_unmap(FIX_TEXT_POKE0);
 	raw_spin_unlock_irqrestore(&patch_lock, flags);
@@ -85,7 +85,73 @@ static int __kprobes __aarch64_insn_write(void *addr, __le32 insn)
 
 int __kprobes aarch64_insn_write(void *addr, u32 insn)
 {
-	return __aarch64_insn_write(addr, cpu_to_le32(insn));
+	__le32 i = cpu_to_le32(insn);
+
+	return __aarch64_insn_write(addr, &i, AARCH64_INSN_SIZE);
+}
+
+static void *strip_cfi_jt(void *addr)
+{
+	if (IS_ENABLED(CONFIG_CFI_CLANG)) {
+		/*
+		 * Taking the address of a function produces the address of the
+		 * jump table entry when Clang CFI is enabled. Such entries are
+		 * ordinary jump instructions, so if we spot one of those, we
+		 * should decode it and use the address of the target instead.
+		 */
+		u32 br = le32_to_cpup(addr);
+
+		if (aarch64_insn_is_b(br))
+			return addr + aarch64_get_branch_offset(br);
+	}
+	return addr;
+}
+
+void arch_static_call_transform(void *site, void *tramp, void *func, bool tail)
+{
+	/*
+	 * -0x8	<literal>
+	 *  0x0	bti c		<--- trampoline entry point
+	 *  0x4	<branch or nop>
+	 *  0x8	ldr x16, <literal>
+	 *  0xc	cbz x16, 20
+	 * 0x10	br x16
+	 * 0x14	ret
+	 */
+	struct {
+		u64	literal;
+		__le32	insn[2];
+	} insns;
+	u32 insn;
+	int ret;
+
+	tramp = strip_cfi_jt(tramp);
+
+	insn = aarch64_insn_gen_hint(AARCH64_INSN_HINT_BTIC);
+	insns.literal = (u64)func;
+	insns.insn[0] = cpu_to_le32(insn);
+
+	if (!func) {
+		insn = aarch64_insn_gen_branch_reg(AARCH64_INSN_REG_LR,
+						   AARCH64_INSN_BRANCH_RETURN);
+	} else {
+		func = strip_cfi_jt(func);
+
+		insn = aarch64_insn_gen_branch_imm((u64)tramp + 4, (u64)func,
+						   AARCH64_INSN_BRANCH_NOLINK);
+
+		/*
+		 * Use a NOP if the branch target is out of range, and rely on
+		 * the indirect call instead.
+		 */
+		if (insn == AARCH64_BREAK_FAULT)
+			insn = aarch64_insn_gen_hint(AARCH64_INSN_HINT_NOP);
+	}
+	insns.insn[1] = cpu_to_le32(insn);
+
+	ret = __aarch64_insn_write(tramp - 8, &insns, sizeof(insns));
+	if (!WARN_ON(ret))
+		caches_clean_inval_pou((u64)tramp - 8, sizeof(insns));
 }
 
 int __kprobes aarch64_insn_patch_text_nosync(void *addr, u32 insn)
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index f6b1a88245db..ceb35c35192c 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -161,6 +161,7 @@ SECTIONS
 			IDMAP_TEXT
 			HIBERNATE_TEXT
 			TRAMP_TEXT
+			STATIC_CALL_TEXT
 			*(.fixup)
 			*(.gnu.warning)
 		. = ALIGN(16);
-- 
2.30.2


