* [PATCH v2 0/2] arm64/module: Enable late module relocations.
@ 2025-04-12 1:09 Dylan Hatch
2025-04-12 1:09 ` [PATCH v2 1/2] arm64: patching: Rename aarch64_insn_copy to text_poke Dylan Hatch
2025-04-12 1:09 ` [PATCH v2 2/2] arm64/module: Use text-poke API for late relocations Dylan Hatch
0 siblings, 2 replies; 15+ messages in thread
From: Dylan Hatch @ 2025-04-12 1:09 UTC (permalink / raw)
To: Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
Eduard Zingerman, Yonghong Song, John Fastabend, KP Singh,
Stanislav Fomichev, Hao Luo, Jiri Olsa, Puranjay Mohan, Xu Kuohai,
Philippe Mathieu-Daudé, Catalin Marinas, Will Deacon,
Mike Rapoport (Microsoft), Arnd Bergmann, Geert Uytterhoeven,
Luis Chamberlain, Andrew Morton, Song Liu, Ard Biesheuvel,
Mark Rutland
Cc: linux-arm-kernel, linux-kernel, Dylan Hatch, Roman Gushchin
Late relocations (after the module is initially loaded) are needed when
livepatches change module code. This is supported by x86, ppc, and s390.
This series borrows the x86 methodology to reach the same level of
support on arm64.
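For reference, the late path exercised by this series is the livepatch
core calling apply_relocate_add() long after module load. A simplified
trace (distilled from kernel/livepatch/core.c; exact call chains vary by
kernel version):

    klp_enable_patch() / klp_module_coming()
      -> klp_init_object_loaded()
        -> klp_apply_object_relocs()
          -> klp_apply_section_relocs()
            -> apply_relocate_add()    /* module text is already RX here */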
Dylan Hatch (2):
arm64: patching: Rename aarch64_insn_copy to text_poke.
arm64/module: Use text-poke API for late relocations.
arch/arm64/include/asm/text-patching.h | 2 +-
arch/arm64/kernel/module.c | 129 ++++++++++++++++---------
arch/arm64/kernel/patching.c | 12 +--
arch/arm64/net/bpf_jit_comp.c | 2 +-
4 files changed, 91 insertions(+), 54 deletions(-)
--
2.49.0.604.gff1f9ca942-goog
* [PATCH v2 1/2] arm64: patching: Rename aarch64_insn_copy to text_poke.
2025-04-12 1:09 [PATCH v2 0/2] arm64/module: Enable late module relocations Dylan Hatch
@ 2025-04-12 1:09 ` Dylan Hatch
2025-04-21 22:14 ` Dylan Hatch
2025-04-22 0:32 ` Song Liu
2025-04-12 1:09 ` [PATCH v2 2/2] arm64/module: Use text-poke API for late relocations Dylan Hatch
1 sibling, 2 replies; 15+ messages in thread
From: Dylan Hatch @ 2025-04-12 1:09 UTC (permalink / raw)
To: Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
Eduard Zingerman, Yonghong Song, John Fastabend, KP Singh,
Stanislav Fomichev, Hao Luo, Jiri Olsa, Puranjay Mohan, Xu Kuohai,
Philippe Mathieu-Daudé, Catalin Marinas, Will Deacon,
Mike Rapoport (Microsoft), Arnd Bergmann, Geert Uytterhoeven,
Luis Chamberlain, Andrew Morton, Song Liu, Ard Biesheuvel,
Mark Rutland
Cc: linux-arm-kernel, linux-kernel, Dylan Hatch, Roman Gushchin
Match the name and signature of the equivalent in the x86 text-poke API.
Making the src pointer const also allows this function to be
interchangeable with memcpy().
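As an illustration, the const-qualified signature lets a caller select
either writer through a single function pointer (a minimal sketch; the
"late" condition here is only a placeholder, and patch 2 applies this
pattern for real):

    void *(*write)(void *dest, const void *src, size_t len) = memcpy;

    if (late)                /* e.g. patching an already-loaded module */
            write = text_poke;

    write(place, &le_insn, sizeof(le_insn));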
Signed-off-by: Dylan Hatch <dylanbhatch@google.com>
---
arch/arm64/include/asm/text-patching.h | 2 +-
arch/arm64/kernel/patching.c | 12 ++++++------
arch/arm64/net/bpf_jit_comp.c | 2 +-
3 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/arch/arm64/include/asm/text-patching.h b/arch/arm64/include/asm/text-patching.h
index 587bdb91ab7a6..450d806d11109 100644
--- a/arch/arm64/include/asm/text-patching.h
+++ b/arch/arm64/include/asm/text-patching.h
@@ -9,7 +9,7 @@ int aarch64_insn_write(void *addr, u32 insn);
int aarch64_insn_write_literal_u64(void *addr, u64 val);
void *aarch64_insn_set(void *dst, u32 insn, size_t len);
-void *aarch64_insn_copy(void *dst, void *src, size_t len);
+void *text_poke(void *dst, const void *src, size_t len);
int aarch64_insn_patch_text_nosync(void *addr, u32 insn);
int aarch64_insn_patch_text(void *addrs[], u32 insns[], int cnt);
diff --git a/arch/arm64/kernel/patching.c b/arch/arm64/kernel/patching.c
index 1041bc67a3eee..e07dc32620053 100644
--- a/arch/arm64/kernel/patching.c
+++ b/arch/arm64/kernel/patching.c
@@ -102,9 +102,9 @@ noinstr int aarch64_insn_write_literal_u64(void *addr, u64 val)
return ret;
}
-typedef void text_poke_f(void *dst, void *src, size_t patched, size_t len);
+typedef void text_poke_f(void *dst, const void *src, size_t patched, size_t len);
-static void *__text_poke(text_poke_f func, void *addr, void *src, size_t len)
+static void *__text_poke(text_poke_f func, void *addr, const void *src, size_t len)
{
unsigned long flags;
size_t patched = 0;
@@ -132,12 +132,12 @@ static void *__text_poke(text_poke_f func, void *addr, void *src, size_t len)
return addr;
}
-static void text_poke_memcpy(void *dst, void *src, size_t patched, size_t len)
+static void text_poke_memcpy(void *dst, const void *src, size_t patched, size_t len)
{
copy_to_kernel_nofault(dst, src + patched, len);
}
-static void text_poke_memset(void *dst, void *src, size_t patched, size_t len)
+static void text_poke_memset(void *dst, const void *src, size_t patched, size_t len)
{
u32 c = *(u32 *)src;
@@ -145,14 +145,14 @@ static void text_poke_memset(void *dst, void *src, size_t patched, size_t len)
}
/**
- * aarch64_insn_copy - Copy instructions into (an unused part of) RX memory
+ * text_poke - Copy instructions into (an unused part of) RX memory
* @dst: address to modify
* @src: source of the copy
* @len: length to copy
*
* Useful for JITs to dump new code blocks into unused regions of RX memory.
*/
-noinstr void *aarch64_insn_copy(void *dst, void *src, size_t len)
+noinstr void *text_poke(void *dst, const void *src, size_t len)
{
/* A64 instructions must be word aligned */
if ((uintptr_t)dst & 0x3)
diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index 70d7c89d3ac90..b5be90edff410 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -2047,7 +2047,7 @@ bool bpf_jit_supports_kfunc_call(void)
void *bpf_arch_text_copy(void *dst, void *src, size_t len)
{
- if (!aarch64_insn_copy(dst, src, len))
+ if (!text_poke(dst, src, len))
return ERR_PTR(-EINVAL);
return dst;
}
--
2.49.0.604.gff1f9ca942-goog
* [PATCH v2 2/2] arm64/module: Use text-poke API for late relocations.
2025-04-12 1:09 [PATCH v2 0/2] arm64/module: Enable late module relocations Dylan Hatch
2025-04-12 1:09 ` [PATCH v2 1/2] arm64: patching: Rename aarch64_insn_copy to text_poke Dylan Hatch
@ 2025-04-12 1:09 ` Dylan Hatch
2025-04-22 0:35 ` Song Liu
` (3 more replies)
1 sibling, 4 replies; 15+ messages in thread
From: Dylan Hatch @ 2025-04-12 1:09 UTC (permalink / raw)
To: Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
Eduard Zingerman, Yonghong Song, John Fastabend, KP Singh,
Stanislav Fomichev, Hao Luo, Jiri Olsa, Puranjay Mohan, Xu Kuohai,
Philippe Mathieu-Daudé, Catalin Marinas, Will Deacon,
Mike Rapoport (Microsoft), Arnd Bergmann, Geert Uytterhoeven,
Luis Chamberlain, Andrew Morton, Song Liu, Ard Biesheuvel,
Mark Rutland
Cc: linux-arm-kernel, linux-kernel, Dylan Hatch, Roman Gushchin
To enable late module patching, livepatch modules need to be able to
apply some of their relocations well after being loaded. In this
scenario, use the text-poking API to allow this, even with
STRICT_MODULE_RWX.
This patch is largely based off commit 88fc078a7a8f6 ("x86/module: Use
text_poke() for late relocations").
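In short (condensed from the hunk below), apply_relocate_add() becomes a
wrapper that picks the writer based on module state, taking text_mutex
around late writes:

    bool early = me->state == MODULE_STATE_UNFORMED;
    void *(*write)(void *, const void *, size_t) = early ? memcpy : text_poke;

    if (!early)
            mutex_lock(&text_mutex);
    ret = __apply_relocate_add(sechdrs, strtab, symindex, relsec, me, write);
    if (!early)
            mutex_unlock(&text_mutex);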
Signed-off-by: Dylan Hatch <dylanbhatch@google.com>
---
arch/arm64/kernel/module.c | 129 ++++++++++++++++++++++++-------------
1 file changed, 83 insertions(+), 46 deletions(-)
diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
index 06bb680bfe975..0703502d9dc37 100644
--- a/arch/arm64/kernel/module.c
+++ b/arch/arm64/kernel/module.c
@@ -18,11 +18,13 @@
#include <linux/moduleloader.h>
#include <linux/random.h>
#include <linux/scs.h>
+#include <linux/memory.h>
#include <asm/alternative.h>
#include <asm/insn.h>
#include <asm/scs.h>
#include <asm/sections.h>
+#include <asm/text-patching.h>
enum aarch64_reloc_op {
RELOC_OP_NONE,
@@ -48,7 +50,8 @@ static u64 do_reloc(enum aarch64_reloc_op reloc_op, __le32 *place, u64 val)
return 0;
}
-static int reloc_data(enum aarch64_reloc_op op, void *place, u64 val, int len)
+static int reloc_data(enum aarch64_reloc_op op, void *place, u64 val, int len,
+ void *(*write)(void *dest, const void *src, size_t len))
{
s64 sval = do_reloc(op, place, val);
@@ -66,7 +69,7 @@ static int reloc_data(enum aarch64_reloc_op op, void *place, u64 val, int len)
switch (len) {
case 16:
- *(s16 *)place = sval;
+ write(place, &sval, sizeof(s16));
switch (op) {
case RELOC_OP_ABS:
if (sval < 0 || sval > U16_MAX)
@@ -82,7 +85,7 @@ static int reloc_data(enum aarch64_reloc_op op, void *place, u64 val, int len)
}
break;
case 32:
- *(s32 *)place = sval;
+ write(place, &sval, sizeof(s32));
switch (op) {
case RELOC_OP_ABS:
if (sval < 0 || sval > U32_MAX)
@@ -98,7 +101,7 @@ static int reloc_data(enum aarch64_reloc_op op, void *place, u64 val, int len)
}
break;
case 64:
- *(s64 *)place = sval;
+ write(place, &sval, sizeof(s64));
break;
default:
pr_err("Invalid length (%d) for data relocation\n", len);
@@ -113,11 +116,13 @@ enum aarch64_insn_movw_imm_type {
};
static int reloc_insn_movw(enum aarch64_reloc_op op, __le32 *place, u64 val,
- int lsb, enum aarch64_insn_movw_imm_type imm_type)
+ int lsb, enum aarch64_insn_movw_imm_type imm_type,
+ void *(*write)(void *dest, const void *src, size_t len))
{
u64 imm;
s64 sval;
u32 insn = le32_to_cpu(*place);
+ __le32 le_insn;
sval = do_reloc(op, place, val);
imm = sval >> lsb;
@@ -145,7 +150,8 @@ static int reloc_insn_movw(enum aarch64_reloc_op op, __le32 *place, u64 val,
/* Update the instruction with the new encoding. */
insn = aarch64_insn_encode_immediate(AARCH64_INSN_IMM_16, insn, imm);
- *place = cpu_to_le32(insn);
+ le_insn = cpu_to_le32(insn);
+ write(place, &le_insn, sizeof(le_insn));
if (imm > U16_MAX)
return -ERANGE;
@@ -154,11 +160,13 @@ static int reloc_insn_movw(enum aarch64_reloc_op op, __le32 *place, u64 val,
}
static int reloc_insn_imm(enum aarch64_reloc_op op, __le32 *place, u64 val,
- int lsb, int len, enum aarch64_insn_imm_type imm_type)
+ int lsb, int len, enum aarch64_insn_imm_type imm_type,
+ void *(*write)(void *dest, const void *src, size_t len))
{
u64 imm, imm_mask;
s64 sval;
u32 insn = le32_to_cpu(*place);
+ __le32 le_insn;
/* Calculate the relocation value. */
sval = do_reloc(op, place, val);
@@ -170,7 +178,8 @@ static int reloc_insn_imm(enum aarch64_reloc_op op, __le32 *place, u64 val,
/* Update the instruction's immediate field. */
insn = aarch64_insn_encode_immediate(imm_type, insn, imm);
- *place = cpu_to_le32(insn);
+ le_insn = cpu_to_le32(insn);
+ write(place, &le_insn, sizeof(le_insn));
/*
* Extract the upper value bits (including the sign bit) and
@@ -189,17 +198,19 @@ static int reloc_insn_imm(enum aarch64_reloc_op op, __le32 *place, u64 val,
}
static int reloc_insn_adrp(struct module *mod, Elf64_Shdr *sechdrs,
- __le32 *place, u64 val)
+ __le32 *place, u64 val,
+ void *(*write)(void *dest, const void *src, size_t len))
{
u32 insn;
+ __le32 le_insn;
if (!is_forbidden_offset_for_adrp(place))
return reloc_insn_imm(RELOC_OP_PAGE, place, val, 12, 21,
- AARCH64_INSN_IMM_ADR);
+ AARCH64_INSN_IMM_ADR, write);
/* patch ADRP to ADR if it is in range */
if (!reloc_insn_imm(RELOC_OP_PREL, place, val & ~0xfff, 0, 21,
- AARCH64_INSN_IMM_ADR)) {
+ AARCH64_INSN_IMM_ADR, write)) {
insn = le32_to_cpu(*place);
insn &= ~BIT(31);
} else {
@@ -211,15 +222,17 @@ static int reloc_insn_adrp(struct module *mod, Elf64_Shdr *sechdrs,
AARCH64_INSN_BRANCH_NOLINK);
}
- *place = cpu_to_le32(insn);
+ le_insn = cpu_to_le32(insn);
+ write(place, &le_insn, sizeof(le_insn));
return 0;
}
-int apply_relocate_add(Elf64_Shdr *sechdrs,
+static int __apply_relocate_add(Elf64_Shdr *sechdrs,
const char *strtab,
unsigned int symindex,
unsigned int relsec,
- struct module *me)
+ struct module *me,
+ void *(*write)(void *dest, const void *src, size_t len))
{
unsigned int i;
int ovf;
@@ -255,23 +268,23 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
/* Data relocations. */
case R_AARCH64_ABS64:
overflow_check = false;
- ovf = reloc_data(RELOC_OP_ABS, loc, val, 64);
+ ovf = reloc_data(RELOC_OP_ABS, loc, val, 64, write);
break;
case R_AARCH64_ABS32:
- ovf = reloc_data(RELOC_OP_ABS, loc, val, 32);
+ ovf = reloc_data(RELOC_OP_ABS, loc, val, 32, write);
break;
case R_AARCH64_ABS16:
- ovf = reloc_data(RELOC_OP_ABS, loc, val, 16);
+ ovf = reloc_data(RELOC_OP_ABS, loc, val, 16, write);
break;
case R_AARCH64_PREL64:
overflow_check = false;
- ovf = reloc_data(RELOC_OP_PREL, loc, val, 64);
+ ovf = reloc_data(RELOC_OP_PREL, loc, val, 64, write);
break;
case R_AARCH64_PREL32:
- ovf = reloc_data(RELOC_OP_PREL, loc, val, 32);
+ ovf = reloc_data(RELOC_OP_PREL, loc, val, 32, write);
break;
case R_AARCH64_PREL16:
- ovf = reloc_data(RELOC_OP_PREL, loc, val, 16);
+ ovf = reloc_data(RELOC_OP_PREL, loc, val, 16, write);
break;
/* MOVW instruction relocations. */
@@ -280,88 +293,88 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
fallthrough;
case R_AARCH64_MOVW_UABS_G0:
ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 0,
- AARCH64_INSN_IMM_MOVKZ);
+ AARCH64_INSN_IMM_MOVKZ, write);
break;
case R_AARCH64_MOVW_UABS_G1_NC:
overflow_check = false;
fallthrough;
case R_AARCH64_MOVW_UABS_G1:
ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 16,
- AARCH64_INSN_IMM_MOVKZ);
+ AARCH64_INSN_IMM_MOVKZ, write);
break;
case R_AARCH64_MOVW_UABS_G2_NC:
overflow_check = false;
fallthrough;
case R_AARCH64_MOVW_UABS_G2:
ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 32,
- AARCH64_INSN_IMM_MOVKZ);
+ AARCH64_INSN_IMM_MOVKZ, write);
break;
case R_AARCH64_MOVW_UABS_G3:
/* We're using the top bits so we can't overflow. */
overflow_check = false;
ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 48,
- AARCH64_INSN_IMM_MOVKZ);
+ AARCH64_INSN_IMM_MOVKZ, write);
break;
case R_AARCH64_MOVW_SABS_G0:
ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 0,
- AARCH64_INSN_IMM_MOVNZ);
+ AARCH64_INSN_IMM_MOVNZ, write);
break;
case R_AARCH64_MOVW_SABS_G1:
ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 16,
- AARCH64_INSN_IMM_MOVNZ);
+ AARCH64_INSN_IMM_MOVNZ, write);
break;
case R_AARCH64_MOVW_SABS_G2:
ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 32,
- AARCH64_INSN_IMM_MOVNZ);
+ AARCH64_INSN_IMM_MOVNZ, write);
break;
case R_AARCH64_MOVW_PREL_G0_NC:
overflow_check = false;
ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 0,
- AARCH64_INSN_IMM_MOVKZ);
+ AARCH64_INSN_IMM_MOVKZ, write);
break;
case R_AARCH64_MOVW_PREL_G0:
ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 0,
- AARCH64_INSN_IMM_MOVNZ);
+ AARCH64_INSN_IMM_MOVNZ, write);
break;
case R_AARCH64_MOVW_PREL_G1_NC:
overflow_check = false;
ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 16,
- AARCH64_INSN_IMM_MOVKZ);
+ AARCH64_INSN_IMM_MOVKZ, write);
break;
case R_AARCH64_MOVW_PREL_G1:
ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 16,
- AARCH64_INSN_IMM_MOVNZ);
+ AARCH64_INSN_IMM_MOVNZ, write);
break;
case R_AARCH64_MOVW_PREL_G2_NC:
overflow_check = false;
ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 32,
- AARCH64_INSN_IMM_MOVKZ);
+ AARCH64_INSN_IMM_MOVKZ, write);
break;
case R_AARCH64_MOVW_PREL_G2:
ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 32,
- AARCH64_INSN_IMM_MOVNZ);
+ AARCH64_INSN_IMM_MOVNZ, write);
break;
case R_AARCH64_MOVW_PREL_G3:
/* We're using the top bits so we can't overflow. */
overflow_check = false;
ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 48,
- AARCH64_INSN_IMM_MOVNZ);
+ AARCH64_INSN_IMM_MOVNZ, write);
break;
/* Immediate instruction relocations. */
case R_AARCH64_LD_PREL_LO19:
ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val, 2, 19,
- AARCH64_INSN_IMM_19);
+ AARCH64_INSN_IMM_19, write);
break;
case R_AARCH64_ADR_PREL_LO21:
ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val, 0, 21,
- AARCH64_INSN_IMM_ADR);
+ AARCH64_INSN_IMM_ADR, write);
break;
case R_AARCH64_ADR_PREL_PG_HI21_NC:
overflow_check = false;
fallthrough;
case R_AARCH64_ADR_PREL_PG_HI21:
- ovf = reloc_insn_adrp(me, sechdrs, loc, val);
+ ovf = reloc_insn_adrp(me, sechdrs, loc, val, write);
if (ovf && ovf != -ERANGE)
return ovf;
break;
@@ -369,46 +382,46 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
case R_AARCH64_LDST8_ABS_LO12_NC:
overflow_check = false;
ovf = reloc_insn_imm(RELOC_OP_ABS, loc, val, 0, 12,
- AARCH64_INSN_IMM_12);
+ AARCH64_INSN_IMM_12, write);
break;
case R_AARCH64_LDST16_ABS_LO12_NC:
overflow_check = false;
ovf = reloc_insn_imm(RELOC_OP_ABS, loc, val, 1, 11,
- AARCH64_INSN_IMM_12);
+ AARCH64_INSN_IMM_12, write);
break;
case R_AARCH64_LDST32_ABS_LO12_NC:
overflow_check = false;
ovf = reloc_insn_imm(RELOC_OP_ABS, loc, val, 2, 10,
- AARCH64_INSN_IMM_12);
+ AARCH64_INSN_IMM_12, write);
break;
case R_AARCH64_LDST64_ABS_LO12_NC:
overflow_check = false;
ovf = reloc_insn_imm(RELOC_OP_ABS, loc, val, 3, 9,
- AARCH64_INSN_IMM_12);
+ AARCH64_INSN_IMM_12, write);
break;
case R_AARCH64_LDST128_ABS_LO12_NC:
overflow_check = false;
ovf = reloc_insn_imm(RELOC_OP_ABS, loc, val, 4, 8,
- AARCH64_INSN_IMM_12);
+ AARCH64_INSN_IMM_12, write);
break;
case R_AARCH64_TSTBR14:
ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val, 2, 14,
- AARCH64_INSN_IMM_14);
+ AARCH64_INSN_IMM_14, write);
break;
case R_AARCH64_CONDBR19:
ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val, 2, 19,
- AARCH64_INSN_IMM_19);
+ AARCH64_INSN_IMM_19, write);
break;
case R_AARCH64_JUMP26:
case R_AARCH64_CALL26:
ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val, 2, 26,
- AARCH64_INSN_IMM_26);
+ AARCH64_INSN_IMM_26, write);
if (ovf == -ERANGE) {
val = module_emit_plt_entry(me, sechdrs, loc, &rel[i], sym);
if (!val)
return -ENOEXEC;
ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val, 2,
- 26, AARCH64_INSN_IMM_26);
+ 26, AARCH64_INSN_IMM_26, write);
}
break;
@@ -431,6 +444,30 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
return -ENOEXEC;
}
+int apply_relocate_add(Elf64_Shdr *sechdrs,
+ const char *strtab,
+ unsigned int symindex,
+ unsigned int relsec,
+ struct module *me)
+{
+ int ret;
+ bool early = me->state == MODULE_STATE_UNFORMED;
+ void *(*write)(void *, const void *, size_t) = memcpy;
+
+ if (!early) {
+ write = text_poke;
+ mutex_lock(&text_mutex);
+ }
+
+ ret = __apply_relocate_add(sechdrs, strtab, symindex, relsec, me,
+ write);
+
+ if (!early)
+ mutex_unlock(&text_mutex);
+
+ return ret;
+}
+
static inline void __init_plt(struct plt_entry *plt, unsigned long addr)
{
*plt = get_plt_entry(addr, plt);
--
2.49.0.604.gff1f9ca942-goog
* Re: [PATCH v2 1/2] arm64: patching: Rename aarch64_insn_copy to text_poke.
2025-04-12 1:09 ` [PATCH v2 1/2] arm64: patching: Rename aarch64_insn_copy to text_poke Dylan Hatch
@ 2025-04-21 22:14 ` Dylan Hatch
2025-04-22 0:32 ` Song Liu
1 sibling, 0 replies; 15+ messages in thread
From: Dylan Hatch @ 2025-04-21 22:14 UTC (permalink / raw)
To: Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
Eduard Zingerman, Yonghong Song, John Fastabend, KP Singh,
Stanislav Fomichev, Hao Luo, Jiri Olsa, Puranjay Mohan, Xu Kuohai,
Philippe Mathieu-Daudé, Catalin Marinas, Will Deacon,
Mike Rapoport (Microsoft), Arnd Bergmann, Geert Uytterhoeven,
Luis Chamberlain, Andrew Morton, Song Liu, Ard Biesheuvel,
Mark Rutland
Cc: linux-arm-kernel, linux-kernel, Roman Gushchin
On Fri, Apr 11, 2025 at 6:10 PM Dylan Hatch <dylanbhatch@google.com> wrote:
>
> Match the name and signature of the equivalent in the x86 text-poke API.
> Making the src pointer const also allows this function to be
> interchangeable with memcpy().
>
> Signed-off-by: Dylan Hatch <dylanbhatch@google.com>
> [... diff snipped; identical to the patch above ...]
>
Friendly ping.
* Re: [PATCH v2 1/2] arm64: patching: Rename aarch64_insn_copy to text_poke.
2025-04-12 1:09 ` [PATCH v2 1/2] arm64: patching: Rename aarch64_insn_copy to text_poke Dylan Hatch
2025-04-21 22:14 ` Dylan Hatch
@ 2025-04-22 0:32 ` Song Liu
1 sibling, 0 replies; 15+ messages in thread
From: Song Liu @ 2025-04-22 0:32 UTC (permalink / raw)
To: Dylan Hatch
Cc: Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
Eduard Zingerman, Yonghong Song, John Fastabend, KP Singh,
Stanislav Fomichev, Hao Luo, Jiri Olsa, Puranjay Mohan, Xu Kuohai,
Philippe Mathieu-Daudé, Catalin Marinas, Will Deacon,
Mike Rapoport (Microsoft), Arnd Bergmann, Geert Uytterhoeven,
Luis Chamberlain, Andrew Morton, Ard Biesheuvel, Mark Rutland,
linux-arm-kernel, linux-kernel, Roman Gushchin
On Fri, Apr 11, 2025 at 6:10 PM Dylan Hatch <dylanbhatch@google.com> wrote:
>
> Match the name and signature of the equivalent in the x86 text-poke API.
> Making the src pointer const also allows this function to be
> interchangeable with memcpy().
>
> Signed-off-by: Dylan Hatch <dylanbhatch@google.com>
Acked-by: Song Liu <song@kernel.org>
* Re: [PATCH v2 2/2] arm64/module: Use text-poke API for late relocations.
2025-04-12 1:09 ` [PATCH v2 2/2] arm64/module: Use text-poke API for late relocations Dylan Hatch
@ 2025-04-22 0:35 ` Song Liu
2025-04-22 6:25 ` Song Liu
2025-04-23 6:04 ` Toshiyuki Sato (Fujitsu)
` (2 subsequent siblings)
3 siblings, 1 reply; 15+ messages in thread
From: Song Liu @ 2025-04-22 0:35 UTC (permalink / raw)
To: Dylan Hatch
Cc: Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
Eduard Zingerman, Yonghong Song, John Fastabend, KP Singh,
Stanislav Fomichev, Hao Luo, Jiri Olsa, Puranjay Mohan, Xu Kuohai,
Philippe Mathieu-Daudé, Catalin Marinas, Will Deacon,
Mike Rapoport (Microsoft), Arnd Bergmann, Geert Uytterhoeven,
Luis Chamberlain, Andrew Morton, Ard Biesheuvel, Mark Rutland,
linux-arm-kernel, linux-kernel, Roman Gushchin
On Fri, Apr 11, 2025 at 6:10 PM Dylan Hatch <dylanbhatch@google.com> wrote:
>
> To enable late module patching, livepatch modules need to be able to
> apply some of their relocations well after being loaded. In this
> scenario, use the text-poking API to allow this, even with
> STRICT_MODULE_RWX.
>
> This patch is largely based off commit 88fc078a7a8f6 ("x86/module: Use
> text_poke() for late relocations").
>
> Signed-off-by: Dylan Hatch <dylanbhatch@google.com>
Acked-by: Song Liu <song@kernel.org>
* Re: [PATCH v2 2/2] arm64/module: Use text-poke API for late relocations.
2025-04-22 0:35 ` Song Liu
@ 2025-04-22 6:25 ` Song Liu
2025-04-23 0:27 ` Dylan Hatch
0 siblings, 1 reply; 15+ messages in thread
From: Song Liu @ 2025-04-22 6:25 UTC (permalink / raw)
To: Dylan Hatch
Cc: Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
Eduard Zingerman, Yonghong Song, John Fastabend, KP Singh,
Stanislav Fomichev, Hao Luo, Jiri Olsa, Puranjay Mohan, Xu Kuohai,
Philippe Mathieu-Daudé, Catalin Marinas, Will Deacon,
Mike Rapoport (Microsoft), Arnd Bergmann, Geert Uytterhoeven,
Luis Chamberlain, Andrew Morton, Ard Biesheuvel, Mark Rutland,
linux-arm-kernel, linux-kernel, Roman Gushchin
On Mon, Apr 21, 2025 at 5:35 PM Song Liu <song@kernel.org> wrote:
>
> On Fri, Apr 11, 2025 at 6:10 PM Dylan Hatch <dylanbhatch@google.com> wrote:
> >
> > To enable late module patching, livepatch modules need to be able to
> > apply some of their relocations well after being loaded. In this
> > scenario, use the text-poking API to allow this, even with
> > STRICT_MODULE_RWX.
> >
> > This patch is largely based off commit 88fc078a7a8f6 ("x86/module: Use
> > text_poke() for late relocations").
> >
> > Signed-off-by: Dylan Hatch <dylanbhatch@google.com>
Could you please share how you test this?
Thanks,
Song
> Acked-by: Song Liu <song@kernel.org>
* Re: [PATCH v2 2/2] arm64/module: Use text-poke API for late relocations.
2025-04-22 6:25 ` Song Liu
@ 2025-04-23 0:27 ` Dylan Hatch
2025-04-23 4:42 ` Song Liu
0 siblings, 1 reply; 15+ messages in thread
From: Dylan Hatch @ 2025-04-23 0:27 UTC (permalink / raw)
To: Song Liu
Cc: Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
Eduard Zingerman, Yonghong Song, John Fastabend, KP Singh,
Stanislav Fomichev, Hao Luo, Jiri Olsa, Puranjay Mohan, Xu Kuohai,
Philippe Mathieu-Daudé, Catalin Marinas, Will Deacon,
Mike Rapoport (Microsoft), Arnd Bergmann, Geert Uytterhoeven,
Luis Chamberlain, Andrew Morton, Ard Biesheuvel, Mark Rutland,
linux-arm-kernel, linux-kernel, Roman Gushchin
On Mon, Apr 21, 2025 at 11:25 PM Song Liu <song@kernel.org> wrote:
>
> On Mon, Apr 21, 2025 at 5:35 PM Song Liu <song@kernel.org> wrote:
> >
> > On Fri, Apr 11, 2025 at 6:10 PM Dylan Hatch <dylanbhatch@google.com> wrote:
> > >
> > > To enable late module patching, livepatch modules need to be able to
> > > apply some of their relocations well after being loaded. In this
> > > scenario, use the text-poking API to allow this, even with
> > > STRICT_MODULE_RWX.
> > >
> > > This patch is largely based off commit 88fc078a7a8f6 ("x86/module: Use
> > > text_poke() for late relocations").
> > >
> > > Signed-off-by: Dylan Hatch <dylanbhatch@google.com>
>
> Could you please share how you test this?
>
For context, we enable livepatch for arm64 by porting this RFC series
(along with other internal patches) into our kernel:
https://lore.kernel.org/all/20230202074036.507249-1-madvenka@linux.microsoft.com/.
The way I tested this patch is: with STRICT_MODULE_RWX, load a module
and a livepatch that touches that module (in either order), and
confirm the kernel doesn't crash.
Without this patch, apply_relocate_add() crashes if a module and a
livepatch that touches it are both loaded. This happens through one of
two code paths:
1. If the module is already loaded when the livepatch is applied,
through the module_init() callback.
2. If the module is loaded after the livepatch is applied, through
prepare_coming_module().
In both scenarios, the livepatch module's text is already RX-only.
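To make the failure mode concrete (a distilled sketch, not verbatim
kernel code): the old path issues a plain store into text that is mapped
RX by that point, which faults, whereas the patched path routes the
store through the patching API:

    /* before: direct store into RX livepatch text -> write abort */
    *place = cpu_to_le32(insn);

    /* after: store via the write() callback, which is text_poke() for
     * late relocations and writes through a temporary RW alias
     */
    le_insn = cpu_to_le32(insn);
    write(place, &le_insn, sizeof(le_insn));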
Thanks,
Dylan
* Re: [PATCH v2 2/2] arm64/module: Use text-poke API for late relocations.
2025-04-23 0:27 ` Dylan Hatch
@ 2025-04-23 4:42 ` Song Liu
0 siblings, 0 replies; 15+ messages in thread
From: Song Liu @ 2025-04-23 4:42 UTC (permalink / raw)
To: Dylan Hatch
Cc: Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
Eduard Zingerman, Yonghong Song, John Fastabend, KP Singh,
Stanislav Fomichev, Hao Luo, Jiri Olsa, Puranjay Mohan, Xu Kuohai,
Philippe Mathieu-Daudé, Catalin Marinas, Will Deacon,
Mike Rapoport (Microsoft), Arnd Bergmann, Geert Uytterhoeven,
Luis Chamberlain, Andrew Morton, Ard Biesheuvel, Mark Rutland,
linux-arm-kernel, linux-kernel, Roman Gushchin
On Tue, Apr 22, 2025 at 5:27 PM Dylan Hatch <dylanbhatch@google.com> wrote:
>
> On Mon, Apr 21, 2025 at 11:25 PM Song Liu <song@kernel.org> wrote:
> >
> > On Mon, Apr 21, 2025 at 5:35 PM Song Liu <song@kernel.org> wrote:
> > >
> > > On Fri, Apr 11, 2025 at 6:10 PM Dylan Hatch <dylanbhatch@google.com> wrote:
> > > >
> > > > To enable late module patching, livepatch modules need to be able to
> > > > apply some of their relocations well after being loaded. In this
> > > > scenario, use the text-poking API to allow this, even with
> > > > STRICT_MODULE_RWX.
> > > >
> > > > This patch is largely based off commit 88fc078a7a8f6 ("x86/module: Use
> > > > text_poke() for late relocations").
> > > >
> > > > Signed-off-by: Dylan Hatch <dylanbhatch@google.com>
> >
> > Could you please share how you test this?
> >
>
> For context, we enable livepatch for arm64 by porting this RFC series
> (along with other internal patches) into our kernel:
> https://lore.kernel.org/all/20230202074036.507249-1-madvenka@linux.microsoft.com/.
>
> The way I tested this patch is: with STRICT_MODULE_RWX, load a module
> and a livepatch that touches that module (in either order), and
> confirm the kernel doesn't crash.
>
> Without this patch, a crash is caused in apply_relocate_add() if both
> a module and a livepatch that touches the module are both loaded. This
> happens through one of two code paths:
>
> 1. If the module is already loaded when the livepatch is applied,
> through the module_init() callback.
> 2. If the module is loaded after the livepatch is applied, through
> prepare_coming_module().
>
> In both scenarios, the livepatch module's text is already RX-only.
Thanks for sharing the information!
Could you please help test this set with [1]? This is a different approach
than the one by Madhavan. We were hoping to ship it soon.
Thanks,
Song
[1] https://lore.kernel.org/live-patching/20250320171559.3423224-1-song@kernel.org/
* RE: [PATCH v2 2/2] arm64/module: Use text-poke API for late relocations.
2025-04-12 1:09 ` [PATCH v2 2/2] arm64/module: Use text-poke API for late relocations Dylan Hatch
2025-04-22 0:35 ` Song Liu
@ 2025-04-23 6:04 ` Toshiyuki Sato (Fujitsu)
2025-04-23 22:59 ` Dylan Hatch
2025-04-29 18:26 ` Dylan Hatch
2025-05-09 16:15 ` Will Deacon
3 siblings, 1 reply; 15+ messages in thread
From: Toshiyuki Sato (Fujitsu) @ 2025-04-23 6:04 UTC (permalink / raw)
To: 'Dylan Hatch'
Cc: 'Daniel Borkmann', 'Andrii Nakryiko',
'Martin KaFai Lau', 'Eduard Zingerman',
'Yonghong Song', 'John Fastabend',
'KP Singh', 'Stanislav Fomichev',
'Hao Luo', 'Jiri Olsa', 'Puranjay Mohan',
'Xu Kuohai', 'Philippe Mathieu-Daudé',
'Catalin Marinas', 'Will Deacon',
'Mike Rapoport (Microsoft)', 'Arnd Bergmann',
'Geert Uytterhoeven', 'Luis Chamberlain',
'Andrew Morton', 'Song Liu',
'Ard Biesheuvel', 'Mark Rutland',
'linux-arm-kernel@lists.infradead.org',
'linux-kernel@vger.kernel.org', 'Roman Gushchin',
Toshiyuki Sato (Fujitsu)
Hi Dylan,
> To enable late module patching, livepatch modules need to be able to
> apply some of their relocations well after being loaded. In this
> scenario, use the text-poking API to allow this, even with
> STRICT_MODULE_RWX.
>
> This patch is largely based off commit 88fc078a7a8f6 ("x86/module: Use
> text_poke() for late relocations").
>
> Signed-off-by: Dylan Hatch <dylanbhatch@google.com>
> [... diff snipped; identical to the patch above ...]
Thanks for posting the patch.
We are testing this patch using two types of kernels.
Both kernels are based on version 6.13: one is patched to use the sframe
unwinder in livepatch [1], and the other is patched not to use sframe [2].
[1] https://lore.kernel.org/all/20250127213310.2496133-1-wnliu@google.com/
[2] https://lore.kernel.org/linux-arm-kernel/20250320171559.3423224-1-song@kernel.org/
For testing, we used the arm64 port of kpatch [3] and that package's
integration tests. In an environment where only the livepatch patches
were applied, "module-LOADED.test" failed; after also applying this
patch, it passed. The test results are discussed in the comments at [4].
[3] https://github.com/dynup/kpatch/pull/1439
[4] https://github.com/dynup/kpatch/pull/1439#issuecomment-2781731440
I just want to confirm one thing.
I think the issues this patch solves are the same as those in the previously
posted patch [5].
Am I correct in understanding that this is an improved version?
Furthermore, this patch also supports rewriting data relocation sections,
which the previously posted patch did not support.
[5] https://lore.kernel.org/linux-arm-kernel/20211103210709.31790-1-surajjs@amazon.com/
Tested-by: Toshiyuki Sato <fj6611ie@aa.jp.fujitsu.com>
Regards,
Toshiyuki Sato
* Re: [PATCH v2 2/2] arm64/module: Use text-poke API for late relocations.
2025-04-23 6:04 ` Toshiyuki Sato (Fujitsu)
@ 2025-04-23 22:59 ` Dylan Hatch
0 siblings, 0 replies; 15+ messages in thread
From: Dylan Hatch @ 2025-04-23 22:59 UTC (permalink / raw)
To: Toshiyuki Sato (Fujitsu)
Cc: Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
Eduard Zingerman, Yonghong Song, John Fastabend, KP Singh,
Stanislav Fomichev, Hao Luo, Jiri Olsa, Puranjay Mohan, Xu Kuohai,
Philippe Mathieu-Daudé, Catalin Marinas, Will Deacon,
Mike Rapoport (Microsoft), Arnd Bergmann, Geert Uytterhoeven,
Luis Chamberlain, Andrew Morton, Song Liu, Ard Biesheuvel,
Mark Rutland, linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, Roman Gushchin
Hi Toshiyuki,
On Tue, Apr 22, 2025 at 11:05 PM Toshiyuki Sato (Fujitsu)
<fj6611ie@fujitsu.com> wrote:
>
> Hi Dylan,
>
> > To enable late module patching, livepatch modules need to be able to
> > apply some of their relocations well after being loaded. In this
> > scenario, use the text-poking API to allow this, even with
> > STRICT_MODULE_RWX.
> >
> > This patch is largely based off commit 88fc078a7a8f6 ("x86/module: Use
> > text_poke() for late relocations").
> >
> > Signed-off-by: Dylan Hatch <dylanbhatch@google.com>
> > [... diff snipped; identical to the patch above ...]
>
> Thanks for posting the patch.
> We are testing this patch using two types of kernels.
> Both kernels are based on version 6.13: one is patched to use the sframe unwinder
> in livepatch [1], and the other is patched not to use sframe [2].
>
> [1] https://lore.kernel.org/all/20250127213310.2496133-1-wnliu@google.com/
> [2] https://lore.kernel.org/linux-arm-kernel/20250320171559.3423224-1-song@kernel.org/
>
> For testing, we used a kpatch for arm64 [3] and the package's integration tests.
> In an environment where only the livepatch patch was applied,
> "module-LOADED.test" failed, but after applying this patch, it passed.
> Here are the comments on the test results: [4]
>
> [3] https://github.com/dynup/kpatch/pull/1439
> [4] https://github.com/dynup/kpatch/pull/1439#issuecomment-2781731440
Thanks for the help with testing here.
> I just want to confirm one thing.
> I think the issues this patch solves are the same as those in the previously
> posted patch [5].
> Am I correct in understanding that this is an improved version?
I wasn't aware patch [5] existed; thanks for the link.
> Furthermore, this patch also supports rewriting data relocation sections,
> which the previously posted patch did not support.
Yes, this is correct. From this perspective, this patch is an improvement.
>
> [5] https://lore.kernel.org/linux-arm-kernel/20211103210709.31790-1-surajjs@amazon.com/
>
> Tested-by: Toshiyuki Sato <fj6611ie@aa.jp.fujitsu.com>
>
> Regards,
> Toshiyuki Sato
>
Thanks,
Dylan
* Re: [PATCH v2 2/2] arm64/module: Use text-poke API for late relocations.
2025-04-12 1:09 ` [PATCH v2 2/2] arm64/module: Use text-poke API for late relocations Dylan Hatch
2025-04-22 0:35 ` Song Liu
2025-04-23 6:04 ` Toshiyuki Sato (Fujitsu)
@ 2025-04-29 18:26 ` Dylan Hatch
2025-05-09 16:15 ` Will Deacon
3 siblings, 0 replies; 15+ messages in thread
From: Dylan Hatch @ 2025-04-29 18:26 UTC (permalink / raw)
To: Catalin Marinas, Will Deacon
Cc: John Fastabend, Yonghong Song, Eduard Zingerman, Martin KaFai Lau,
Andrii Nakryiko, Daniel Borkmann, Philippe Mathieu-Daudé,
Stanislav Fomichev, KP Singh, Xu Kuohai, Puranjay Mohan,
Jiri Olsa, Hao Luo, Mike Rapoport (Microsoft), Ard Biesheuvel,
Luis Chamberlain, Geert Uytterhoeven, Arnd Bergmann, Song Liu,
Andrew Morton, Mark Rutland, linux-arm-kernel, linux-kernel,
Roman Gushchin
On Fri, Apr 11, 2025 at 6:10 PM Dylan Hatch <dylanbhatch@google.com> wrote:
>
> To enable late module patching, livepatch modules need to be able to
> apply some of their relocations well after being loaded. In this
> scenario, use the text-poking API to allow this, even with
> STRICT_MODULE_RWX.
>
> This patch is largely based off commit 88fc078a7a8f6 ("x86/module: Use
> text_poke() for late relocations").
>
> Signed-off-by: Dylan Hatch <dylanbhatch@google.com>
> ---
> arch/arm64/kernel/module.c | 129 ++++++++++++++++++++++++-------------
> 1 file changed, 83 insertions(+), 46 deletions(-)
>
> diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
> index 06bb680bfe975..0703502d9dc37 100644
> --- a/arch/arm64/kernel/module.c
> +++ b/arch/arm64/kernel/module.c
> @@ -18,11 +18,13 @@
> #include <linux/moduleloader.h>
> #include <linux/random.h>
> #include <linux/scs.h>
> +#include <linux/memory.h>
>
> #include <asm/alternative.h>
> #include <asm/insn.h>
> #include <asm/scs.h>
> #include <asm/sections.h>
> +#include <asm/text-patching.h>
>
> enum aarch64_reloc_op {
> RELOC_OP_NONE,
> @@ -48,7 +50,8 @@ static u64 do_reloc(enum aarch64_reloc_op reloc_op, __le32 *place, u64 val)
> return 0;
> }
>
> -static int reloc_data(enum aarch64_reloc_op op, void *place, u64 val, int len)
> +static int reloc_data(enum aarch64_reloc_op op, void *place, u64 val, int len,
> + void *(*write)(void *dest, const void *src, size_t len))
> {
> s64 sval = do_reloc(op, place, val);
>
> @@ -66,7 +69,7 @@ static int reloc_data(enum aarch64_reloc_op op, void *place, u64 val, int len)
>
> switch (len) {
> case 16:
> - *(s16 *)place = sval;
> + write(place, &sval, sizeof(s16));
> switch (op) {
> case RELOC_OP_ABS:
> if (sval < 0 || sval > U16_MAX)
> @@ -82,7 +85,7 @@ static int reloc_data(enum aarch64_reloc_op op, void *place, u64 val, int len)
> }
> break;
> case 32:
> - *(s32 *)place = sval;
> + write(place, &sval, sizeof(s32));
> switch (op) {
> case RELOC_OP_ABS:
> if (sval < 0 || sval > U32_MAX)
> @@ -98,7 +101,7 @@ static int reloc_data(enum aarch64_reloc_op op, void *place, u64 val, int len)
> }
> break;
> case 64:
> - *(s64 *)place = sval;
> + write(place, &sval, sizeof(s64));
> break;
> default:
> pr_err("Invalid length (%d) for data relocation\n", len);
> @@ -113,11 +116,13 @@ enum aarch64_insn_movw_imm_type {
> };
>
> static int reloc_insn_movw(enum aarch64_reloc_op op, __le32 *place, u64 val,
> - int lsb, enum aarch64_insn_movw_imm_type imm_type)
> + int lsb, enum aarch64_insn_movw_imm_type imm_type,
> + void *(*write)(void *dest, const void *src, size_t len))
> {
> u64 imm;
> s64 sval;
> u32 insn = le32_to_cpu(*place);
> + __le32 le_insn;
>
> sval = do_reloc(op, place, val);
> imm = sval >> lsb;
> @@ -145,7 +150,8 @@ static int reloc_insn_movw(enum aarch64_reloc_op op, __le32 *place, u64 val,
>
> /* Update the instruction with the new encoding. */
> insn = aarch64_insn_encode_immediate(AARCH64_INSN_IMM_16, insn, imm);
> - *place = cpu_to_le32(insn);
> + le_insn = cpu_to_le32(insn);
> + write(place, &le_insn, sizeof(le_insn));
>
> if (imm > U16_MAX)
> return -ERANGE;
> @@ -154,11 +160,13 @@ static int reloc_insn_movw(enum aarch64_reloc_op op, __le32 *place, u64 val,
> }
>
> static int reloc_insn_imm(enum aarch64_reloc_op op, __le32 *place, u64 val,
> - int lsb, int len, enum aarch64_insn_imm_type imm_type)
> + int lsb, int len, enum aarch64_insn_imm_type imm_type,
> + void *(*write)(void *dest, const void *src, size_t len))
> {
> u64 imm, imm_mask;
> s64 sval;
> u32 insn = le32_to_cpu(*place);
> + __le32 le_insn;
>
> /* Calculate the relocation value. */
> sval = do_reloc(op, place, val);
> @@ -170,7 +178,8 @@ static int reloc_insn_imm(enum aarch64_reloc_op op, __le32 *place, u64 val,
>
> /* Update the instruction's immediate field. */
> insn = aarch64_insn_encode_immediate(imm_type, insn, imm);
> - *place = cpu_to_le32(insn);
> + le_insn = cpu_to_le32(insn);
> + write(place, &le_insn, sizeof(le_insn));
>
> /*
> * Extract the upper value bits (including the sign bit) and
> @@ -189,17 +198,19 @@ static int reloc_insn_imm(enum aarch64_reloc_op op, __le32 *place, u64 val,
> }
>
> static int reloc_insn_adrp(struct module *mod, Elf64_Shdr *sechdrs,
> - __le32 *place, u64 val)
> + __le32 *place, u64 val,
> + void *(*write)(void *dest, const void *src, size_t len))
> {
> u32 insn;
> + __le32 le_insn;
>
> if (!is_forbidden_offset_for_adrp(place))
> return reloc_insn_imm(RELOC_OP_PAGE, place, val, 12, 21,
> - AARCH64_INSN_IMM_ADR);
> + AARCH64_INSN_IMM_ADR, write);
>
> /* patch ADRP to ADR if it is in range */
> if (!reloc_insn_imm(RELOC_OP_PREL, place, val & ~0xfff, 0, 21,
> - AARCH64_INSN_IMM_ADR)) {
> + AARCH64_INSN_IMM_ADR, write)) {
> insn = le32_to_cpu(*place);
> insn &= ~BIT(31);
> } else {
> @@ -211,15 +222,17 @@ static int reloc_insn_adrp(struct module *mod, Elf64_Shdr *sechdrs,
> AARCH64_INSN_BRANCH_NOLINK);
> }
>
> - *place = cpu_to_le32(insn);
> + le_insn = cpu_to_le32(insn);
> + write(place, &le_insn, sizeof(le_insn));
> return 0;
> }
>
> -int apply_relocate_add(Elf64_Shdr *sechdrs,
> +static int __apply_relocate_add(Elf64_Shdr *sechdrs,
> const char *strtab,
> unsigned int symindex,
> unsigned int relsec,
> - struct module *me)
> + struct module *me,
> + void *(*write)(void *dest, const void *src, size_t len))
> {
> unsigned int i;
> int ovf;
> @@ -255,23 +268,23 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
> /* Data relocations. */
> case R_AARCH64_ABS64:
> overflow_check = false;
> - ovf = reloc_data(RELOC_OP_ABS, loc, val, 64);
> + ovf = reloc_data(RELOC_OP_ABS, loc, val, 64, write);
> break;
> case R_AARCH64_ABS32:
> - ovf = reloc_data(RELOC_OP_ABS, loc, val, 32);
> + ovf = reloc_data(RELOC_OP_ABS, loc, val, 32, write);
> break;
> case R_AARCH64_ABS16:
> - ovf = reloc_data(RELOC_OP_ABS, loc, val, 16);
> + ovf = reloc_data(RELOC_OP_ABS, loc, val, 16, write);
> break;
> case R_AARCH64_PREL64:
> overflow_check = false;
> - ovf = reloc_data(RELOC_OP_PREL, loc, val, 64);
> + ovf = reloc_data(RELOC_OP_PREL, loc, val, 64, write);
> break;
> case R_AARCH64_PREL32:
> - ovf = reloc_data(RELOC_OP_PREL, loc, val, 32);
> + ovf = reloc_data(RELOC_OP_PREL, loc, val, 32, write);
> break;
> case R_AARCH64_PREL16:
> - ovf = reloc_data(RELOC_OP_PREL, loc, val, 16);
> + ovf = reloc_data(RELOC_OP_PREL, loc, val, 16, write);
> break;
>
> /* MOVW instruction relocations. */
> @@ -280,88 +293,88 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
> fallthrough;
> case R_AARCH64_MOVW_UABS_G0:
> ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 0,
> - AARCH64_INSN_IMM_MOVKZ);
> + AARCH64_INSN_IMM_MOVKZ, write);
> break;
> case R_AARCH64_MOVW_UABS_G1_NC:
> overflow_check = false;
> fallthrough;
> case R_AARCH64_MOVW_UABS_G1:
> ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 16,
> - AARCH64_INSN_IMM_MOVKZ);
> + AARCH64_INSN_IMM_MOVKZ, write);
> break;
> case R_AARCH64_MOVW_UABS_G2_NC:
> overflow_check = false;
> fallthrough;
> case R_AARCH64_MOVW_UABS_G2:
> ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 32,
> - AARCH64_INSN_IMM_MOVKZ);
> + AARCH64_INSN_IMM_MOVKZ, write);
> break;
> case R_AARCH64_MOVW_UABS_G3:
> /* We're using the top bits so we can't overflow. */
> overflow_check = false;
> ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 48,
> - AARCH64_INSN_IMM_MOVKZ);
> + AARCH64_INSN_IMM_MOVKZ, write);
> break;
> case R_AARCH64_MOVW_SABS_G0:
> ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 0,
> - AARCH64_INSN_IMM_MOVNZ);
> + AARCH64_INSN_IMM_MOVNZ, write);
> break;
> case R_AARCH64_MOVW_SABS_G1:
> ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 16,
> - AARCH64_INSN_IMM_MOVNZ);
> + AARCH64_INSN_IMM_MOVNZ, write);
> break;
> case R_AARCH64_MOVW_SABS_G2:
> ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 32,
> - AARCH64_INSN_IMM_MOVNZ);
> + AARCH64_INSN_IMM_MOVNZ, write);
> break;
> case R_AARCH64_MOVW_PREL_G0_NC:
> overflow_check = false;
> ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 0,
> - AARCH64_INSN_IMM_MOVKZ);
> + AARCH64_INSN_IMM_MOVKZ, write);
> break;
> case R_AARCH64_MOVW_PREL_G0:
> ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 0,
> - AARCH64_INSN_IMM_MOVNZ);
> + AARCH64_INSN_IMM_MOVNZ, write);
> break;
> case R_AARCH64_MOVW_PREL_G1_NC:
> overflow_check = false;
> ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 16,
> - AARCH64_INSN_IMM_MOVKZ);
> + AARCH64_INSN_IMM_MOVKZ, write);
> break;
> case R_AARCH64_MOVW_PREL_G1:
> ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 16,
> - AARCH64_INSN_IMM_MOVNZ);
> + AARCH64_INSN_IMM_MOVNZ, write);
> break;
> case R_AARCH64_MOVW_PREL_G2_NC:
> overflow_check = false;
> ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 32,
> - AARCH64_INSN_IMM_MOVKZ);
> + AARCH64_INSN_IMM_MOVKZ, write);
> break;
> case R_AARCH64_MOVW_PREL_G2:
> ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 32,
> - AARCH64_INSN_IMM_MOVNZ);
> + AARCH64_INSN_IMM_MOVNZ, write);
> break;
> case R_AARCH64_MOVW_PREL_G3:
> /* We're using the top bits so we can't overflow. */
> overflow_check = false;
> ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 48,
> - AARCH64_INSN_IMM_MOVNZ);
> + AARCH64_INSN_IMM_MOVNZ, write);
> break;
>
> /* Immediate instruction relocations. */
> case R_AARCH64_LD_PREL_LO19:
> ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val, 2, 19,
> - AARCH64_INSN_IMM_19);
> + AARCH64_INSN_IMM_19, write);
> break;
> case R_AARCH64_ADR_PREL_LO21:
> ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val, 0, 21,
> - AARCH64_INSN_IMM_ADR);
> + AARCH64_INSN_IMM_ADR, write);
> break;
> case R_AARCH64_ADR_PREL_PG_HI21_NC:
> overflow_check = false;
> fallthrough;
> case R_AARCH64_ADR_PREL_PG_HI21:
> - ovf = reloc_insn_adrp(me, sechdrs, loc, val);
> + ovf = reloc_insn_adrp(me, sechdrs, loc, val, write);
> if (ovf && ovf != -ERANGE)
> return ovf;
> break;
> @@ -369,46 +382,46 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
> case R_AARCH64_LDST8_ABS_LO12_NC:
> overflow_check = false;
> ovf = reloc_insn_imm(RELOC_OP_ABS, loc, val, 0, 12,
> - AARCH64_INSN_IMM_12);
> + AARCH64_INSN_IMM_12, write);
> break;
> case R_AARCH64_LDST16_ABS_LO12_NC:
> overflow_check = false;
> ovf = reloc_insn_imm(RELOC_OP_ABS, loc, val, 1, 11,
> - AARCH64_INSN_IMM_12);
> + AARCH64_INSN_IMM_12, write);
> break;
> case R_AARCH64_LDST32_ABS_LO12_NC:
> overflow_check = false;
> ovf = reloc_insn_imm(RELOC_OP_ABS, loc, val, 2, 10,
> - AARCH64_INSN_IMM_12);
> + AARCH64_INSN_IMM_12, write);
> break;
> case R_AARCH64_LDST64_ABS_LO12_NC:
> overflow_check = false;
> ovf = reloc_insn_imm(RELOC_OP_ABS, loc, val, 3, 9,
> - AARCH64_INSN_IMM_12);
> + AARCH64_INSN_IMM_12, write);
> break;
> case R_AARCH64_LDST128_ABS_LO12_NC:
> overflow_check = false;
> ovf = reloc_insn_imm(RELOC_OP_ABS, loc, val, 4, 8,
> - AARCH64_INSN_IMM_12);
> + AARCH64_INSN_IMM_12, write);
> break;
> case R_AARCH64_TSTBR14:
> ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val, 2, 14,
> - AARCH64_INSN_IMM_14);
> + AARCH64_INSN_IMM_14, write);
> break;
> case R_AARCH64_CONDBR19:
> ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val, 2, 19,
> - AARCH64_INSN_IMM_19);
> + AARCH64_INSN_IMM_19, write);
> break;
> case R_AARCH64_JUMP26:
> case R_AARCH64_CALL26:
> ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val, 2, 26,
> - AARCH64_INSN_IMM_26);
> + AARCH64_INSN_IMM_26, write);
> if (ovf == -ERANGE) {
> val = module_emit_plt_entry(me, sechdrs, loc, &rel[i], sym);
> if (!val)
> return -ENOEXEC;
> ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val, 2,
> - 26, AARCH64_INSN_IMM_26);
> + 26, AARCH64_INSN_IMM_26, write);
> }
> break;
>
> @@ -431,6 +444,30 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
> return -ENOEXEC;
> }
>
> +int apply_relocate_add(Elf64_Shdr *sechdrs,
> + const char *strtab,
> + unsigned int symindex,
> + unsigned int relsec,
> + struct module *me)
> +{
> + int ret;
> + bool early = me->state == MODULE_STATE_UNFORMED;
> + void *(*write)(void *, const void *, size_t) = memcpy;
> +
> + if (!early) {
> + write = text_poke;
> + mutex_lock(&text_mutex);
> + }
> +
> + ret = __apply_relocate_add(sechdrs, strtab, symindex, relsec, me,
> + write);
> +
> + if (!early)
> + mutex_unlock(&text_mutex);
> +
> + return ret;
> +}
> +
> static inline void __init_plt(struct plt_entry *plt, unsigned long addr)
> {
> *plt = get_plt_entry(addr, plt);
> --
> 2.49.0.604.gff1f9ca942-goog
>
Catalin and Will,
Are you comfortable with taking these patches? Is this a workable
approach to enable livepatching arm64 modules?
Thanks,
Dylan
* Re: [PATCH v2 2/2] arm64/module: Use text-poke API for late relocations.
2025-04-12 1:09 ` [PATCH v2 2/2] arm64/module: Use text-poke API for late relocations Dylan Hatch
` (2 preceding siblings ...)
2025-04-29 18:26 ` Dylan Hatch
@ 2025-05-09 16:15 ` Will Deacon
2025-05-09 16:39 ` Song Liu
2025-05-22 23:35 ` Dylan Hatch
3 siblings, 2 replies; 15+ messages in thread
From: Will Deacon @ 2025-05-09 16:15 UTC (permalink / raw)
To: Dylan Hatch
Cc: Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
Eduard Zingerman, Yonghong Song, John Fastabend, KP Singh,
Stanislav Fomichev, Hao Luo, Jiri Olsa, Puranjay Mohan, Xu Kuohai,
Philippe Mathieu-Daudé, Catalin Marinas,
Mike Rapoport (Microsoft), Arnd Bergmann, Geert Uytterhoeven,
Luis Chamberlain, Andrew Morton, Song Liu, Ard Biesheuvel,
Mark Rutland, linux-arm-kernel, linux-kernel, Roman Gushchin
Hi Dylan,
On Sat, Apr 12, 2025 at 01:09:39AM +0000, Dylan Hatch wrote:
> To enable late module patching, livepatch modules need to be able to
> apply some of their relocations well after being loaded. In this
> scenario, use the text-poking API to allow this, even with
> STRICT_MODULE_RWX.
>
> This patch is largely based off commit 88fc078a7a8f6 ("x86/module: Use
> text_poke() for late relocations").
>
> Signed-off-by: Dylan Hatch <dylanbhatch@google.com>
> ---
> arch/arm64/kernel/module.c | 129 ++++++++++++++++++++++++-------------
> 1 file changed, 83 insertions(+), 46 deletions(-)
On its own, this isn't gaining us an awful lot upstream as we don't have
livepatch support (arm64 does not select HAVE_LIVEPATCH), however I'm
not against incremental changes towards enabling that. Are you planning
to work on follow-up changes for the rest of the support?
> diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
> index 06bb680bfe975..0703502d9dc37 100644
> --- a/arch/arm64/kernel/module.c
> +++ b/arch/arm64/kernel/module.c
> @@ -18,11 +18,13 @@
> #include <linux/moduleloader.h>
> #include <linux/random.h>
> #include <linux/scs.h>
> +#include <linux/memory.h>
>
> #include <asm/alternative.h>
> #include <asm/insn.h>
> #include <asm/scs.h>
> #include <asm/sections.h>
> +#include <asm/text-patching.h>
>
> enum aarch64_reloc_op {
> RELOC_OP_NONE,
> @@ -48,7 +50,8 @@ static u64 do_reloc(enum aarch64_reloc_op reloc_op, __le32 *place, u64 val)
> return 0;
> }
>
> -static int reloc_data(enum aarch64_reloc_op op, void *place, u64 val, int len)
> +static int reloc_data(enum aarch64_reloc_op op, void *place, u64 val, int len,
> + void *(*write)(void *dest, const void *src, size_t len))
> {
I think it's a bit grotty indirecting the write when in reality we either
need a straight-forward assignment (like we have today) or a call to
an instruction-patching helper.
In particular, when you get to functions such as:
> static int reloc_insn_movw(enum aarch64_reloc_op op, __le32 *place, u64 val,
> - int lsb, enum aarch64_insn_movw_imm_type imm_type)
> + int lsb, enum aarch64_insn_movw_imm_type imm_type,
> + void *(*write)(void *dest, const void *src, size_t len))
> {
> u64 imm;
> s64 sval;
> u32 insn = le32_to_cpu(*place);
> + __le32 le_insn;
>
> sval = do_reloc(op, place, val);
> imm = sval >> lsb;
> @@ -145,7 +150,8 @@ static int reloc_insn_movw(enum aarch64_reloc_op op, __le32 *place, u64 val,
>
> /* Update the instruction with the new encoding. */
> insn = aarch64_insn_encode_immediate(AARCH64_INSN_IMM_16, insn, imm);
> - *place = cpu_to_le32(insn);
> + le_insn = cpu_to_le32(insn);
> + write(place, &le_insn, sizeof(le_insn));
we're forced into passing &le_insn because we need the same function
prototype as memcpy().
Instead, how about we pass the 'struct module *mod' pointer down from
apply_relocate_add()? That's already done for reloc_insn_adrp() and it
would mean that the above could be written as:
static int reloc_insn_movw(struct module *mod, enum aarch64_reloc_op op,
__le32 *place, u64 val, int lsb,
enum aarch64_insn_movw_imm_type imm_type)
{
...
insn = aarch64_insn_encode_immediate(AARCH64_INSN_IMM_16, insn, imm);
insn = cpu_to_le32(insn);
if (mod->state != MODULE_STATE_UNFORMED)
aarch64_insn_set(place, insn, sizeof(insn));
else
*place = insn;
meaning we can also drop the first patch, because I don't think we need
a text_poke() helper.
> +int apply_relocate_add(Elf64_Shdr *sechdrs,
> + const char *strtab,
> + unsigned int symindex,
> + unsigned int relsec,
> + struct module *me)
> +{
> + int ret;
> + bool early = me->state == MODULE_STATE_UNFORMED;
> + void *(*write)(void *, const void *, size_t) = memcpy;
> +
> + if (!early) {
> + write = text_poke;
> + mutex_lock(&text_mutex);
> + }
> +
> + ret = __apply_relocate_add(sechdrs, strtab, symindex, relsec, me,
> + write);
> +
> + if (!early)
> + mutex_unlock(&text_mutex);
Why is it the responsibility of the arch code to take the 'text_mutex' here?
The core code should be able to do that when the module state is !=
MODULE_STATE_UNFORMED.
Cheers,
Will
* Re: [PATCH v2 2/2] arm64/module: Use text-poke API for late relocations.
2025-05-09 16:15 ` Will Deacon
@ 2025-05-09 16:39 ` Song Liu
2025-05-22 23:35 ` Dylan Hatch
1 sibling, 0 replies; 15+ messages in thread
From: Song Liu @ 2025-05-09 16:39 UTC (permalink / raw)
To: Will Deacon
Cc: Dylan Hatch, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
Eduard Zingerman, Yonghong Song, John Fastabend, KP Singh,
Stanislav Fomichev, Hao Luo, Jiri Olsa, Puranjay Mohan, Xu Kuohai,
Philippe Mathieu-Daudé, Catalin Marinas,
Mike Rapoport (Microsoft), Arnd Bergmann, Geert Uytterhoeven,
Luis Chamberlain, Andrew Morton, Ard Biesheuvel, Mark Rutland,
linux-arm-kernel, linux-kernel, Roman Gushchin
Hi Will,
On Fri, May 9, 2025 at 9:15 AM Will Deacon <will@kernel.org> wrote:
>
> Hi Dylan,
>
> On Sat, Apr 12, 2025 at 01:09:39AM +0000, Dylan Hatch wrote:
> > To enable late module patching, livepatch modules need to be able to
> > apply some of their relocations well after being loaded. In this
> > scenario, use the text-poking API to allow this, even with
> > STRICT_MODULE_RWX.
> >
> > This patch is largely based off commit 88fc078a7a8f6 ("x86/module: Use
> > text_poke() for late relocations").
> >
> > Signed-off-by: Dylan Hatch <dylanbhatch@google.com>
> > ---
> > arch/arm64/kernel/module.c | 129 ++++++++++++++++++++++++-------------
> > 1 file changed, 83 insertions(+), 46 deletions(-)
>
> On its own, this isn't gaining us an awful lot upstream as we don't have
> livepatch support (arm64 does not select HAVE_LIVEPATCH), however I'm
> not against incremental changes towards enabling that. Are you planning
> to work on follow-up changes for the rest of the support?
There are two patchsets that are trying to enable HAVE_LIVEPATCH
for arm64:
[1] https://lore.kernel.org/all/20250127213310.2496133-1-wnliu@google.com/
[2] https://lore.kernel.org/linux-arm-kernel/20250320171559.3423224-1-song@kernel.org/
Toshiyuki has tested this patch with both approaches.
Since patchset [1] depends on SFrame, which is less mature at the
moment (clang doesn't support it, and the kernel-side code is very new),
the livepatch folks think [2] is the better approach for now. Could you
please share your thoughts on [2]? If it looks good, we hope to ship it
in 6.16 kernels, as we (Meta) want to use livepatch in our fleet.
Thanks,
Song
* Re: [PATCH v2 2/2] arm64/module: Use text-poke API for late relocations.
2025-05-09 16:15 ` Will Deacon
2025-05-09 16:39 ` Song Liu
@ 2025-05-22 23:35 ` Dylan Hatch
1 sibling, 0 replies; 15+ messages in thread
From: Dylan Hatch @ 2025-05-22 23:35 UTC (permalink / raw)
To: Will Deacon
Cc: Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
Eduard Zingerman, Yonghong Song, John Fastabend, KP Singh,
Stanislav Fomichev, Hao Luo, Jiri Olsa, Puranjay Mohan, Xu Kuohai,
Philippe Mathieu-Daudé, Catalin Marinas,
Mike Rapoport (Microsoft), Arnd Bergmann, Geert Uytterhoeven,
Luis Chamberlain, Andrew Morton, Song Liu, Ard Biesheuvel,
Mark Rutland, linux-arm-kernel, linux-kernel, Roman Gushchin
Hi Will,
I've sent a v4 of these patches that should address your comments,
feel free to have a look.
On Fri, May 9, 2025 at 9:15 AM Will Deacon <will@kernel.org> wrote:
>
> Hi Dylan,
>
> On Sat, Apr 12, 2025 at 01:09:39AM +0000, Dylan Hatch wrote:
> > To enable late module patching, livepatch modules need to be able to
> > apply some of their relocations well after being loaded. In this
> > scenario, use the text-poking API to allow this, even with
> > STRICT_MODULE_RWX.
> >
> > This patch is largely based off commit 88fc078a7a8f6 ("x86/module: Use
> > text_poke() for late relocations").
> >
> > Signed-off-by: Dylan Hatch <dylanbhatch@google.com>
> > ---
> > arch/arm64/kernel/module.c | 129 ++++++++++++++++++++++++-------------
> > 1 file changed, 83 insertions(+), 46 deletions(-)
>
> On its own, this isn't gaining us an awful lot upstream as we don't have
> livepatch support (arm64 does not select HAVE_LIVEPATCH), however I'm
> not against incremental changes towards enabling that. Are you planning
> to work on follow-up changes for the rest of the support?
As Song mentioned, this patch in combination with
https://lore.kernel.org/linux-arm-kernel/20250320171559.3423224-1-song@kernel.org/
should hopefully be enough for initial livepatch support on arm64. I'll
need to follow up with Weinan on next steps
for the SFrame approach.
>
> > diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
> > index 06bb680bfe975..0703502d9dc37 100644
> > --- a/arch/arm64/kernel/module.c
> > +++ b/arch/arm64/kernel/module.c
> > @@ -18,11 +18,13 @@
> > #include <linux/moduleloader.h>
> > #include <linux/random.h>
> > #include <linux/scs.h>
> > +#include <linux/memory.h>
> >
> > #include <asm/alternative.h>
> > #include <asm/insn.h>
> > #include <asm/scs.h>
> > #include <asm/sections.h>
> > +#include <asm/text-patching.h>
> >
> > enum aarch64_reloc_op {
> > RELOC_OP_NONE,
> > @@ -48,7 +50,8 @@ static u64 do_reloc(enum aarch64_reloc_op reloc_op, __le32 *place, u64 val)
> > return 0;
> > }
> >
> > -static int reloc_data(enum aarch64_reloc_op op, void *place, u64 val, int len)
> > +static int reloc_data(enum aarch64_reloc_op op, void *place, u64 val, int len,
> > + void *(*write)(void *dest, const void *src, size_t len))
> > {
>
> I think it's a bit grotty indirecting the write when in reality we either
> need a straight-forward assignment (like we have today) or a call to
> an instruction-patching helper.
>
> In particular, when you get to functions such as:
>
> > static int reloc_insn_movw(enum aarch64_reloc_op op, __le32 *place, u64 val,
> > - int lsb, enum aarch64_insn_movw_imm_type imm_type)
> > + int lsb, enum aarch64_insn_movw_imm_type imm_type,
> > + void *(*write)(void *dest, const void *src, size_t len))
> > {
> > u64 imm;
> > s64 sval;
> > u32 insn = le32_to_cpu(*place);
> > + __le32 le_insn;
> >
> > sval = do_reloc(op, place, val);
> > imm = sval >> lsb;
> > @@ -145,7 +150,8 @@ static int reloc_insn_movw(enum aarch64_reloc_op op, __le32 *place, u64 val,
> >
> > /* Update the instruction with the new encoding. */
> > insn = aarch64_insn_encode_immediate(AARCH64_INSN_IMM_16, insn, imm);
> > - *place = cpu_to_le32(insn);
> > + le_insn = cpu_to_le32(insn);
> > + write(place, &le_insn, sizeof(le_insn));
>
> we're forced into passing &le_insn because we need the same function
> prototype as memcpy().
>
> Instead, how about we pass the 'struct module *mod' pointer down from
> apply_relocate_add()? That's already done for reloc_insn_adrp() and it
> would mean that the above could be written as:
>
>
> static int reloc_insn_movw(struct module *mod, enum aarch64_reloc_op op,
> __le32 *place, u64 val, int lsb,
> enum aarch64_insn_movw_imm_type imm_type)
> {
> ...
>
> insn = aarch64_insn_encode_immediate(AARCH64_INSN_IMM_16, insn, imm);
> insn = cpu_to_le32(insn);
>
> if (mod->state != MODULE_STATE_UNFORMED)
> aarch64_insn_set(place, insn, sizeof(insn));
> else
> *place = insn;
>
>
> meaning we can also drop the first patch, because I don't think we need
> a text_poke() helper.
Dropped the first patch in v4 and switched to the method suggested
here, so we use a normal assignment for the non-late case. Though it
does seem a little repetitive, with 6 different sites doing this
module state check. If having a straightforward assignment directly in
the reloc_* functions isn't too important, I wonder if we can make
local memset/memcpy-like helpers to contain this check? Like:
static void *write_insn(void *place, u32 insn, struct module *me);
static void *write_data(void *place, s64 *sval, struct module *me);
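A rough, untested sketch of what I mean (note: write_data() here grows
a len argument that my prototypes above omit, since reloc_data() writes
16-, 32-, and 64-bit values, and the helper bodies are an assumption,
not the actual v4 code):

/*
 * Sketch only: plain assignment for early relocations, the existing
 * arm64 text-poking helper for late ones. The caller must hold
 * text_mutex in the late case.
 */
static void *write_insn(void *place, u32 insn, struct module *me)
{
	__le32 le_insn = cpu_to_le32(insn);

	if (me->state != MODULE_STATE_UNFORMED)
		return aarch64_insn_copy(place, &le_insn, sizeof(le_insn));

	*(__le32 *)place = le_insn;
	return place;
}

static void *write_data(void *place, s64 *sval, size_t len, struct module *me)
{
	/* len is in bytes; like v2, the s16/s32 cases assume little-endian. */
	if (me->state != MODULE_STATE_UNFORMED)
		return aarch64_insn_copy(place, sval, len);

	return memcpy(place, sval, len);
}

With those, the reloc_* functions could call the helpers unconditionally
and the module state check would live in exactly two places.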
>
> > +int apply_relocate_add(Elf64_Shdr *sechdrs,
> > + const char *strtab,
> > + unsigned int symindex,
> > + unsigned int relsec,
> > + struct module *me)
> > +{
> > + int ret;
> > + bool early = me->state == MODULE_STATE_UNFORMED;
> > + void *(*write)(void *, const void *, size_t) = memcpy;
> > +
> > + if (!early) {
> > + write = text_poke;
> > + mutex_lock(&text_mutex);
> > + }
> > +
> > + ret = __apply_relocate_add(sechdrs, strtab, symindex, relsec, me,
> > + write);
> > +
> > + if (!early)
> > + mutex_unlock(&text_mutex);
>
> Why is it the responsibility of the arch code to take the 'text_mutex' here?
> The core code should be able to do that when the module state is !=
> MODULE_STATE_UNFORMED.
>
Moved the locking to kernel/livepatch/core.c in v4, since other call
sites to apply_relocate_add() don't attempt late relocations. Since
the locking was already being done in the x86 module code, I had to
remove it from there. It also made sense to me to split out the text_mutex
changes into a separate patch because they only touch module/x86 and
livepatch code, so that's how I did it in v4. But they can just as
easily be squashed into one patch.
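For illustration, the shape of that change is roughly the following
(a sketch of the idea only; the helper name is made up and the actual
v4 hunk in kernel/livepatch/core.c differs):

/*
 * Sketch: livepatch core serializes late relocations against other
 * text patching, so apply_relocate_add() stays lock-free on every
 * arch. Needs <linux/memory.h> for text_mutex and
 * <linux/moduleloader.h> for apply_relocate_add().
 */
static int klp_apply_relocs_locked(Elf64_Shdr *sechdrs, const char *strtab,
				   unsigned int symindex, unsigned int relsec,
				   struct module *me)
{
	int ret;

	mutex_lock(&text_mutex);
	ret = apply_relocate_add(sechdrs, strtab, symindex, relsec, me);
	mutex_unlock(&text_mutex);

	return ret;
}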
Thanks,
Dylan