linux-kernel.vger.kernel.org archive mirror
* [PATCH v4 0/2] livepatch, arm64/module: Enable late module relocations.
@ 2025-05-22 20:52 Dylan Hatch
  2025-05-22 20:52 ` [PATCH v4 1/2] livepatch, x86/module: Generalize late module relocation locking Dylan Hatch
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Dylan Hatch @ 2025-05-22 20:52 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Josh Poimboeuf,
	Jiri Kosina, Miroslav Benes, Petr Mladek, Joe Lawrence
  Cc: Dylan Hatch, Song Liu, Ard Biesheuvel, Sami Tolvanen,
	Peter Zijlstra, Mike Rapoport (Microsoft), Andrew Morton,
	Dan Carpenter, linux-arm-kernel, linux-kernel, live-patching,
	Roman Gushchin, Toshiyuki Sato

Late relocations (after the module is initially loaded) are needed when
livepatches change module code. This is supported by x86, ppc, and s390.
This series borrows the x86 methodology to reach the same level of
support on arm64, and moves the text-poke locking into the core livepatch
code to reduce redundancy.

Dylan Hatch (2):
  livepatch, x86/module: Generalize late module relocation locking.
  arm64/module: Use text-poke API for late relocations.

 arch/arm64/kernel/module.c | 113 ++++++++++++++++++++++---------------
 arch/x86/kernel/module.c   |   8 +--
 kernel/livepatch/core.c    |  18 ++++--
 3 files changed, 84 insertions(+), 55 deletions(-)

-- 
2.49.0.1151.ga128411c76-goog



* [PATCH v4 1/2] livepatch, x86/module: Generalize late module relocation locking.
  2025-05-22 20:52 [PATCH v4 0/2] livepatch, arm64/module: Enable late module relocations Dylan Hatch
@ 2025-05-22 20:52 ` Dylan Hatch
  2025-05-27 14:46   ` Petr Mladek
  2025-05-22 20:52 ` [PATCH v4 2/2] arm64/module: Use text-poke API for late relocations Dylan Hatch
  2025-05-23  5:55 ` [PATCH v4 0/2] livepatch, arm64/module: Enable late module relocations Toshiyuki Sato (Fujitsu)
  2 siblings, 1 reply; 6+ messages in thread
From: Dylan Hatch @ 2025-05-22 20:52 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Josh Poimboeuf,
	Jiri Kosina, Miroslav Benes, Petr Mladek, Joe Lawrence
  Cc: Dylan Hatch, Song Liu, Ard Biesheuvel, Sami Tolvanen,
	Peter Zijlstra, Mike Rapoport (Microsoft), Andrew Morton,
	Dan Carpenter, linux-arm-kernel, linux-kernel, live-patching,
	Roman Gushchin, Toshiyuki Sato

Late module relocations are an issue on any arch that supports
livepatch, so move the text_mutex locking to the livepatch core code.

Signed-off-by: Dylan Hatch <dylanbhatch@google.com>
Acked-by: Song Liu <song@kernel.org>
---
 arch/x86/kernel/module.c |  8 ++------
 kernel/livepatch/core.c  | 18 +++++++++++++-----
 2 files changed, 15 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
index ff07558b7ebc6..38767e0047d0c 100644
--- a/arch/x86/kernel/module.c
+++ b/arch/x86/kernel/module.c
@@ -197,18 +197,14 @@ static int write_relocate_add(Elf64_Shdr *sechdrs,
 	bool early = me->state == MODULE_STATE_UNFORMED;
 	void *(*write)(void *, const void *, size_t) = memcpy;
 
-	if (!early) {
+	if (!early)
 		write = text_poke;
-		mutex_lock(&text_mutex);
-	}
 
 	ret = __write_relocate_add(sechdrs, strtab, symindex, relsec, me,
 				   write, apply);
 
-	if (!early) {
+	if (!early)
 		text_poke_sync();
-		mutex_unlock(&text_mutex);
-	}
 
 	return ret;
 }
diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index 0e73fac55f8eb..9968441f73510 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -294,9 +294,10 @@ static int klp_write_section_relocs(struct module *pmod, Elf_Shdr *sechdrs,
 				    unsigned int symndx, unsigned int secndx,
 				    const char *objname, bool apply)
 {
-	int cnt, ret;
+	int cnt, ret = 0;
 	char sec_objname[MODULE_NAME_LEN];
 	Elf_Shdr *sec = sechdrs + secndx;
+	bool early = pmod->state == MODULE_STATE_UNFORMED;
 
 	/*
 	 * Format: .klp.rela.sec_objname.section_name
@@ -319,12 +320,19 @@ static int klp_write_section_relocs(struct module *pmod, Elf_Shdr *sechdrs,
 					  sec, sec_objname);
 		if (ret)
 			return ret;
-
-		return apply_relocate_add(sechdrs, strtab, symndx, secndx, pmod);
 	}
 
-	clear_relocate_add(sechdrs, strtab, symndx, secndx, pmod);
-	return 0;
+	if (!early)
+		mutex_lock(&text_mutex);
+
+	if (apply)
+		ret = apply_relocate_add(sechdrs, strtab, symndx, secndx, pmod);
+	else
+		clear_relocate_add(sechdrs, strtab, symndx, secndx, pmod);
+
+	if (!early)
+		mutex_unlock(&text_mutex);
+	return ret;
 }
 
 int klp_apply_section_relocs(struct module *pmod, Elf_Shdr *sechdrs,
-- 
2.49.0.1151.ga128411c76-goog



* [PATCH v4 2/2] arm64/module: Use text-poke API for late relocations.
  2025-05-22 20:52 [PATCH v4 0/2] livepatch, arm64/module: Enable late module relocations Dylan Hatch
  2025-05-22 20:52 ` [PATCH v4 1/2] livepatch, x86/module: Generalize late module relocation locking Dylan Hatch
@ 2025-05-22 20:52 ` Dylan Hatch
  2025-05-23  5:55 ` [PATCH v4 0/2] livepatch, arm64/module: Enable late module relocations Toshiyuki Sato (Fujitsu)
  2 siblings, 0 replies; 6+ messages in thread
From: Dylan Hatch @ 2025-05-22 20:52 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Josh Poimboeuf,
	Jiri Kosina, Miroslav Benes, Petr Mladek, Joe Lawrence
  Cc: Dylan Hatch, Song Liu, Ard Biesheuvel, Sami Tolvanen,
	Peter Zijlstra, Mike Rapoport (Microsoft), Andrew Morton,
	Dan Carpenter, linux-arm-kernel, linux-kernel, live-patching,
	Roman Gushchin, Toshiyuki Sato

To enable late module patching, livepatch modules need to be able to
apply some of their relocations well after being loaded. In this
scenario, use the text-poking API so the writes succeed even with
STRICT_MODULE_RWX enabled.

This patch is partially based on commit 88fc078a7a8f6 ("x86/module: Use
text_poke() for late relocations").

Signed-off-by: Dylan Hatch <dylanbhatch@google.com>
Acked-by: Song Liu <song@kernel.org>
---
 arch/arm64/kernel/module.c | 113 ++++++++++++++++++++++---------------
 1 file changed, 69 insertions(+), 44 deletions(-)

diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
index 06bb680bfe975..6fbc3dbdcb425 100644
--- a/arch/arm64/kernel/module.c
+++ b/arch/arm64/kernel/module.c
@@ -18,11 +18,13 @@
 #include <linux/moduleloader.h>
 #include <linux/random.h>
 #include <linux/scs.h>
+#include <linux/memory.h>
 
 #include <asm/alternative.h>
 #include <asm/insn.h>
 #include <asm/scs.h>
 #include <asm/sections.h>
+#include <asm/text-patching.h>
 
 enum aarch64_reloc_op {
 	RELOC_OP_NONE,
@@ -48,7 +50,8 @@ static u64 do_reloc(enum aarch64_reloc_op reloc_op, __le32 *place, u64 val)
 	return 0;
 }
 
-static int reloc_data(enum aarch64_reloc_op op, void *place, u64 val, int len)
+static int reloc_data(enum aarch64_reloc_op op, void *place, u64 val, int len,
+		      struct module *me)
 {
 	s64 sval = do_reloc(op, place, val);
 
@@ -66,7 +69,11 @@ static int reloc_data(enum aarch64_reloc_op op, void *place, u64 val, int len)
 
 	switch (len) {
 	case 16:
-		*(s16 *)place = sval;
+		if (me->state != MODULE_STATE_UNFORMED)
+			aarch64_insn_copy(place, &sval, sizeof(s16));
+		else
+			*(s16 *)place = sval;
+
 		switch (op) {
 		case RELOC_OP_ABS:
 			if (sval < 0 || sval > U16_MAX)
@@ -82,7 +89,11 @@ static int reloc_data(enum aarch64_reloc_op op, void *place, u64 val, int len)
 		}
 		break;
 	case 32:
-		*(s32 *)place = sval;
+		if (me->state != MODULE_STATE_UNFORMED)
+			aarch64_insn_copy(place, &sval, sizeof(s32));
+		else
+			*(s32 *)place = sval;
+
 		switch (op) {
 		case RELOC_OP_ABS:
 			if (sval < 0 || sval > U32_MAX)
@@ -98,7 +109,10 @@ static int reloc_data(enum aarch64_reloc_op op, void *place, u64 val, int len)
 		}
 		break;
 	case 64:
-		*(s64 *)place = sval;
+		if (me->state != MODULE_STATE_UNFORMED)
+			aarch64_insn_copy(place, &sval, sizeof(s64));
+		else
+			*(s64 *)place = sval;
 		break;
 	default:
 		pr_err("Invalid length (%d) for data relocation\n", len);
@@ -113,7 +127,8 @@ enum aarch64_insn_movw_imm_type {
 };
 
 static int reloc_insn_movw(enum aarch64_reloc_op op, __le32 *place, u64 val,
-			   int lsb, enum aarch64_insn_movw_imm_type imm_type)
+			   int lsb, enum aarch64_insn_movw_imm_type imm_type,
+			   struct module *me)
 {
 	u64 imm;
 	s64 sval;
@@ -145,7 +160,10 @@ static int reloc_insn_movw(enum aarch64_reloc_op op, __le32 *place, u64 val,
 
 	/* Update the instruction with the new encoding. */
 	insn = aarch64_insn_encode_immediate(AARCH64_INSN_IMM_16, insn, imm);
-	*place = cpu_to_le32(insn);
+	if (me->state != MODULE_STATE_UNFORMED)
+		aarch64_insn_set(place, cpu_to_le32(insn), sizeof(insn));
+	else
+		*place = cpu_to_le32(insn);
 
 	if (imm > U16_MAX)
 		return -ERANGE;
@@ -154,7 +172,8 @@ static int reloc_insn_movw(enum aarch64_reloc_op op, __le32 *place, u64 val,
 }
 
 static int reloc_insn_imm(enum aarch64_reloc_op op, __le32 *place, u64 val,
-			  int lsb, int len, enum aarch64_insn_imm_type imm_type)
+			  int lsb, int len, enum aarch64_insn_imm_type imm_type,
+			  struct module *me)
 {
 	u64 imm, imm_mask;
 	s64 sval;
@@ -170,7 +189,10 @@ static int reloc_insn_imm(enum aarch64_reloc_op op, __le32 *place, u64 val,
 
 	/* Update the instruction's immediate field. */
 	insn = aarch64_insn_encode_immediate(imm_type, insn, imm);
-	*place = cpu_to_le32(insn);
+	if (me->state != MODULE_STATE_UNFORMED)
+		aarch64_insn_set(place, cpu_to_le32(insn), sizeof(insn));
+	else
+		*place = cpu_to_le32(insn);
 
 	/*
 	 * Extract the upper value bits (including the sign bit) and
@@ -189,17 +211,17 @@ static int reloc_insn_imm(enum aarch64_reloc_op op, __le32 *place, u64 val,
 }
 
 static int reloc_insn_adrp(struct module *mod, Elf64_Shdr *sechdrs,
-			   __le32 *place, u64 val)
+			   __le32 *place, u64 val, struct module *me)
 {
 	u32 insn;
 
 	if (!is_forbidden_offset_for_adrp(place))
 		return reloc_insn_imm(RELOC_OP_PAGE, place, val, 12, 21,
-				      AARCH64_INSN_IMM_ADR);
+				      AARCH64_INSN_IMM_ADR, me);
 
 	/* patch ADRP to ADR if it is in range */
 	if (!reloc_insn_imm(RELOC_OP_PREL, place, val & ~0xfff, 0, 21,
-			    AARCH64_INSN_IMM_ADR)) {
+			    AARCH64_INSN_IMM_ADR, me)) {
 		insn = le32_to_cpu(*place);
 		insn &= ~BIT(31);
 	} else {
@@ -211,7 +233,10 @@ static int reloc_insn_adrp(struct module *mod, Elf64_Shdr *sechdrs,
 						   AARCH64_INSN_BRANCH_NOLINK);
 	}
 
-	*place = cpu_to_le32(insn);
+	if (me->state != MODULE_STATE_UNFORMED)
+		aarch64_insn_set(place, cpu_to_le32(insn), sizeof(insn));
+	else
+		*place = cpu_to_le32(insn);
 	return 0;
 }
 
@@ -255,23 +280,23 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
 		/* Data relocations. */
 		case R_AARCH64_ABS64:
 			overflow_check = false;
-			ovf = reloc_data(RELOC_OP_ABS, loc, val, 64);
+			ovf = reloc_data(RELOC_OP_ABS, loc, val, 64, me);
 			break;
 		case R_AARCH64_ABS32:
-			ovf = reloc_data(RELOC_OP_ABS, loc, val, 32);
+			ovf = reloc_data(RELOC_OP_ABS, loc, val, 32, me);
 			break;
 		case R_AARCH64_ABS16:
-			ovf = reloc_data(RELOC_OP_ABS, loc, val, 16);
+			ovf = reloc_data(RELOC_OP_ABS, loc, val, 16, me);
 			break;
 		case R_AARCH64_PREL64:
 			overflow_check = false;
-			ovf = reloc_data(RELOC_OP_PREL, loc, val, 64);
+			ovf = reloc_data(RELOC_OP_PREL, loc, val, 64, me);
 			break;
 		case R_AARCH64_PREL32:
-			ovf = reloc_data(RELOC_OP_PREL, loc, val, 32);
+			ovf = reloc_data(RELOC_OP_PREL, loc, val, 32, me);
 			break;
 		case R_AARCH64_PREL16:
-			ovf = reloc_data(RELOC_OP_PREL, loc, val, 16);
+			ovf = reloc_data(RELOC_OP_PREL, loc, val, 16, me);
 			break;
 
 		/* MOVW instruction relocations. */
@@ -280,88 +305,88 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
 			fallthrough;
 		case R_AARCH64_MOVW_UABS_G0:
 			ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 0,
-					      AARCH64_INSN_IMM_MOVKZ);
+					      AARCH64_INSN_IMM_MOVKZ, me);
 			break;
 		case R_AARCH64_MOVW_UABS_G1_NC:
 			overflow_check = false;
 			fallthrough;
 		case R_AARCH64_MOVW_UABS_G1:
 			ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 16,
-					      AARCH64_INSN_IMM_MOVKZ);
+					      AARCH64_INSN_IMM_MOVKZ, me);
 			break;
 		case R_AARCH64_MOVW_UABS_G2_NC:
 			overflow_check = false;
 			fallthrough;
 		case R_AARCH64_MOVW_UABS_G2:
 			ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 32,
-					      AARCH64_INSN_IMM_MOVKZ);
+					      AARCH64_INSN_IMM_MOVKZ, me);
 			break;
 		case R_AARCH64_MOVW_UABS_G3:
 			/* We're using the top bits so we can't overflow. */
 			overflow_check = false;
 			ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 48,
-					      AARCH64_INSN_IMM_MOVKZ);
+					      AARCH64_INSN_IMM_MOVKZ, me);
 			break;
 		case R_AARCH64_MOVW_SABS_G0:
 			ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 0,
-					      AARCH64_INSN_IMM_MOVNZ);
+					      AARCH64_INSN_IMM_MOVNZ, me);
 			break;
 		case R_AARCH64_MOVW_SABS_G1:
 			ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 16,
-					      AARCH64_INSN_IMM_MOVNZ);
+					      AARCH64_INSN_IMM_MOVNZ, me);
 			break;
 		case R_AARCH64_MOVW_SABS_G2:
 			ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 32,
-					      AARCH64_INSN_IMM_MOVNZ);
+					      AARCH64_INSN_IMM_MOVNZ, me);
 			break;
 		case R_AARCH64_MOVW_PREL_G0_NC:
 			overflow_check = false;
 			ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 0,
-					      AARCH64_INSN_IMM_MOVKZ);
+					      AARCH64_INSN_IMM_MOVKZ, me);
 			break;
 		case R_AARCH64_MOVW_PREL_G0:
 			ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 0,
-					      AARCH64_INSN_IMM_MOVNZ);
+					      AARCH64_INSN_IMM_MOVNZ, me);
 			break;
 		case R_AARCH64_MOVW_PREL_G1_NC:
 			overflow_check = false;
 			ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 16,
-					      AARCH64_INSN_IMM_MOVKZ);
+					      AARCH64_INSN_IMM_MOVKZ, me);
 			break;
 		case R_AARCH64_MOVW_PREL_G1:
 			ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 16,
-					      AARCH64_INSN_IMM_MOVNZ);
+					      AARCH64_INSN_IMM_MOVNZ, me);
 			break;
 		case R_AARCH64_MOVW_PREL_G2_NC:
 			overflow_check = false;
 			ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 32,
-					      AARCH64_INSN_IMM_MOVKZ);
+					      AARCH64_INSN_IMM_MOVKZ, me);
 			break;
 		case R_AARCH64_MOVW_PREL_G2:
 			ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 32,
-					      AARCH64_INSN_IMM_MOVNZ);
+					      AARCH64_INSN_IMM_MOVNZ, me);
 			break;
 		case R_AARCH64_MOVW_PREL_G3:
 			/* We're using the top bits so we can't overflow. */
 			overflow_check = false;
 			ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 48,
-					      AARCH64_INSN_IMM_MOVNZ);
+					      AARCH64_INSN_IMM_MOVNZ, me);
 			break;
 
 		/* Immediate instruction relocations. */
 		case R_AARCH64_LD_PREL_LO19:
 			ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val, 2, 19,
-					     AARCH64_INSN_IMM_19);
+					     AARCH64_INSN_IMM_19, me);
 			break;
 		case R_AARCH64_ADR_PREL_LO21:
 			ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val, 0, 21,
-					     AARCH64_INSN_IMM_ADR);
+					     AARCH64_INSN_IMM_ADR, me);
 			break;
 		case R_AARCH64_ADR_PREL_PG_HI21_NC:
 			overflow_check = false;
 			fallthrough;
 		case R_AARCH64_ADR_PREL_PG_HI21:
-			ovf = reloc_insn_adrp(me, sechdrs, loc, val);
+			ovf = reloc_insn_adrp(me, sechdrs, loc, val, me);
 			if (ovf && ovf != -ERANGE)
 				return ovf;
 			break;
@@ -369,46 +394,46 @@ int apply_relocate_add(Elf64_Shdr *sechdrs,
 		case R_AARCH64_LDST8_ABS_LO12_NC:
 			overflow_check = false;
 			ovf = reloc_insn_imm(RELOC_OP_ABS, loc, val, 0, 12,
-					     AARCH64_INSN_IMM_12);
+					     AARCH64_INSN_IMM_12, me);
 			break;
 		case R_AARCH64_LDST16_ABS_LO12_NC:
 			overflow_check = false;
 			ovf = reloc_insn_imm(RELOC_OP_ABS, loc, val, 1, 11,
-					     AARCH64_INSN_IMM_12);
+					     AARCH64_INSN_IMM_12, me);
 			break;
 		case R_AARCH64_LDST32_ABS_LO12_NC:
 			overflow_check = false;
 			ovf = reloc_insn_imm(RELOC_OP_ABS, loc, val, 2, 10,
-					     AARCH64_INSN_IMM_12);
+					     AARCH64_INSN_IMM_12, me);
 			break;
 		case R_AARCH64_LDST64_ABS_LO12_NC:
 			overflow_check = false;
 			ovf = reloc_insn_imm(RELOC_OP_ABS, loc, val, 3, 9,
-					     AARCH64_INSN_IMM_12);
+					     AARCH64_INSN_IMM_12, me);
 			break;
 		case R_AARCH64_LDST128_ABS_LO12_NC:
 			overflow_check = false;
 			ovf = reloc_insn_imm(RELOC_OP_ABS, loc, val, 4, 8,
-					     AARCH64_INSN_IMM_12);
+					     AARCH64_INSN_IMM_12, me);
 			break;
 		case R_AARCH64_TSTBR14:
 			ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val, 2, 14,
-					     AARCH64_INSN_IMM_14);
+					     AARCH64_INSN_IMM_14, me);
 			break;
 		case R_AARCH64_CONDBR19:
 			ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val, 2, 19,
-					     AARCH64_INSN_IMM_19);
+					     AARCH64_INSN_IMM_19, me);
 			break;
 		case R_AARCH64_JUMP26:
 		case R_AARCH64_CALL26:
 			ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val, 2, 26,
-					     AARCH64_INSN_IMM_26);
+					     AARCH64_INSN_IMM_26, me);
 			if (ovf == -ERANGE) {
 				val = module_emit_plt_entry(me, sechdrs, loc, &rel[i], sym);
 				if (!val)
 					return -ENOEXEC;
 				ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val, 2,
-						     26, AARCH64_INSN_IMM_26);
+						     26, AARCH64_INSN_IMM_26, me);
 			}
 			break;
 
-- 
2.49.0.1151.ga128411c76-goog



* RE: [PATCH v4 0/2] livepatch, arm64/module: Enable late module relocations.
  2025-05-22 20:52 [PATCH v4 0/2] livepatch, arm64/module: Enable late module relocations Dylan Hatch
  2025-05-22 20:52 ` [PATCH v4 1/2] livepatch, x86/module: Generalize late module relocation locking Dylan Hatch
  2025-05-22 20:52 ` [PATCH v4 2/2] arm64/module: Use text-poke API for late relocations Dylan Hatch
@ 2025-05-23  5:55 ` Toshiyuki Sato (Fujitsu)
  2 siblings, 0 replies; 6+ messages in thread
From: Toshiyuki Sato (Fujitsu) @ 2025-05-23  5:55 UTC (permalink / raw)
  To: 'Dylan Hatch'
  Cc: Catalin Marinas, Will Deacon, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86@kernel.org, H. Peter Anvin,
	Josh Poimboeuf, Jiri Kosina, Miroslav Benes, Petr Mladek,
	Joe Lawrence, Song Liu, Ard Biesheuvel, Sami Tolvanen,
	Peter Zijlstra, Mike Rapoport (Microsoft), Andrew Morton,
	Dan Carpenter, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, live-patching@vger.kernel.org,
	Roman Gushchin, Toshiyuki Sato (Fujitsu)

Hi Dylan,

> Late relocations (after the module is initially loaded) are needed when
> livepatches change module code. This is supported by x86, ppc, and s390.
> This series borrows the x86 methodology to reach the same level of support on
> arm64, and moves the text-poke locking into the core livepatch code to reduce
> redundancy.
> 
> Dylan Hatch (2):
>   livepatch, x86/module: Generalize late module relocation locking.
>   arm64/module: Use text-poke API for late relocations.
> 
>  arch/arm64/kernel/module.c | 113 ++++++++++++++++++++++---------------
>  arch/x86/kernel/module.c   |   8 +--
>  kernel/livepatch/core.c    |  18 ++++--
>  3 files changed, 84 insertions(+), 55 deletions(-)
> 
> --
> 2.49.0.1151.ga128411c76-goog

Thanks for posting the new patch.

I ran kpatch's integration tests and no issues were detected.

The livepatch patches [1][2] were applied to kernel 6.15-rc7, with arch/arm64/Kconfig adjusted manually.
The kpatch build is the same one used in the previous test [3][4].

[1] https://lore.kernel.org/all/20250521111000.2237470-1-mark.rutland@arm.com/
[2] https://lore.kernel.org/all/20250320171559.3423224-3-song@kernel.org/
[3] https://lore.kernel.org/all/TY4PR01MB1377739F1CC08549A619C8635D7BA2@TY4PR01MB13777.jpnprd01.prod.outlook.com/
[4] https://github.com/dynup/kpatch/pull/1439

Tested-by: Toshiyuki Sato <fj6611ie@aa.jp.fujitsu.com>

Regards,
Toshiyuki Sato



* Re: [PATCH v4 1/2] livepatch, x86/module: Generalize late module relocation locking.
  2025-05-22 20:52 ` [PATCH v4 1/2] livepatch, x86/module: Generalize late module relocation locking Dylan Hatch
@ 2025-05-27 14:46   ` Petr Mladek
  2025-05-29  0:47     ` Dylan Hatch
  0 siblings, 1 reply; 6+ messages in thread
From: Petr Mladek @ 2025-05-27 14:46 UTC (permalink / raw)
  To: Dylan Hatch
  Cc: Catalin Marinas, Will Deacon, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Josh Poimboeuf,
	Jiri Kosina, Miroslav Benes, Joe Lawrence, Song Liu,
	Ard Biesheuvel, Sami Tolvanen, Peter Zijlstra,
	Mike Rapoport (Microsoft), Andrew Morton, Dan Carpenter,
	linux-arm-kernel, linux-kernel, live-patching, Roman Gushchin,
	Toshiyuki Sato

On Thu 2025-05-22 20:52:04, Dylan Hatch wrote:
> Late module relocations are an issue on any arch that supports
> livepatch, so move the text_mutex locking to the livepatch core code.
> 
> Signed-off-by: Dylan Hatch <dylanbhatch@google.com>
> Acked-by: Song Liu <song@kernel.org>
> ---
>  arch/x86/kernel/module.c |  8 ++------
>  kernel/livepatch/core.c  | 18 +++++++++++++-----
>  2 files changed, 15 insertions(+), 11 deletions(-)
> 
> --- a/arch/x86/kernel/module.c
> +++ b/arch/x86/kernel/module.c
> @@ -197,18 +197,14 @@ static int write_relocate_add(Elf64_Shdr *sechdrs,
>  	bool early = me->state == MODULE_STATE_UNFORMED;
>  	void *(*write)(void *, const void *, size_t) = memcpy;
>  
> -	if (!early) {
> +	if (!early)
>  		write = text_poke;
> -		mutex_lock(&text_mutex);
> -	}
>  
>  	ret = __write_relocate_add(sechdrs, strtab, symindex, relsec, me,
>  				   write, apply);
>  
> -	if (!early) {
> +	if (!early)
>  		text_poke_sync();
> -		mutex_unlock(&text_mutex);
> -	}
>  
>  	return ret;
>  }
> diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
> index 0e73fac55f8eb..9968441f73510 100644
> --- a/kernel/livepatch/core.c
> +++ b/kernel/livepatch/core.c
> @@ -319,12 +320,19 @@ static int klp_write_section_relocs(struct module *pmod, Elf_Shdr *sechdrs,
>  					  sec, sec_objname);
>  		if (ret)
>  			return ret;
> -
> -		return apply_relocate_add(sechdrs, strtab, symndx, secndx, pmod);
>  	}
>  
> -	clear_relocate_add(sechdrs, strtab, symndx, secndx, pmod);
> -	return 0;
> +	if (!early)
> +		mutex_lock(&text_mutex);

I understand why you do this, but it raises some questions.

As this patch suggests, the "text_mutex" has been used to
sychronize apply_relocate_add() only on x86_64 so far.

s390x seems to rely on "s390_kernel_write_lock" taken by:

  + apply_relocate_add()
    + s390_kernel_write()
      + __s390_kernel_write()

And powerpc seems to rely on "pte" locking taken by

  + apply_relocate_add()
    + patch_instruction()
      + patch_mem()
	+ __do_patch_mem_mm()
	  + get_locked_pte()

I see two possibilities:

  1. Either this change gives a false impression that "text_mutex"
     synchronizes apply_relocate_add() on all architectures.

     This does not seem to be the case on, for example, s390
     and powerpc.

     => The code is misleading and could lead to trouble.


   2. Or it actually provides some synchronization on all
      architectures, for example, against kprobe code.

      In this case, it might actually fix an existing race.
      It should be described in the commit message
      and nominated for backporting to stable.


I am sorry if this has already been discussed. But I have been
in Cc only for v3 and v4. And there is no changelog in
the cover letter.

> +
> +	if (apply)
> +		ret = apply_relocate_add(sechdrs, strtab, symndx, secndx, pmod);
> +	else
> +		clear_relocate_add(sechdrs, strtab, symndx, secndx, pmod);
> +
> +	if (!early)
> +		mutex_unlock(&text_mutex);
> +	return ret;
>  }

Best Regards,
Petr


* Re: [PATCH v4 1/2] livepatch, x86/module: Generalize late module relocation locking.
  2025-05-27 14:46   ` Petr Mladek
@ 2025-05-29  0:47     ` Dylan Hatch
  0 siblings, 0 replies; 6+ messages in thread
From: Dylan Hatch @ 2025-05-29  0:47 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Catalin Marinas, Will Deacon, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Josh Poimboeuf,
	Jiri Kosina, Miroslav Benes, Joe Lawrence, Song Liu,
	Ard Biesheuvel, Sami Tolvanen, Peter Zijlstra,
	Mike Rapoport (Microsoft), Andrew Morton, Dan Carpenter,
	linux-arm-kernel, linux-kernel, live-patching, Roman Gushchin,
	Toshiyuki Sato

Hi Petr,

On Tue, May 27, 2025 at 7:46 AM Petr Mladek <pmladek@suse.com> wrote:
>
> As this patch suggests, the "text_mutex" has been used to
> synchronize apply_relocate_add() only on x86_64 so far.
>
> s390x seems to rely on "s390_kernel_write_lock" taken by:
>
>   + apply_relocate_add()
>     + s390_kernel_write()
>       + __s390_kernel_write()
>
> And powerpc seems to rely on "pte" locking taken by
>
>   + apply_relocate_add()
>     + patch_instruction()
>       + patch_mem()
>         + __do_patch_mem_mm()
>           + get_locked_pte()
>

Reading through these implementations, I see that the serialization
happens only at the level of each individual write action. This is
equivalent to the patch_lock already used by aarch64_insn_copy() and
aarch64_insn_set(). I see now that this same serialization is achieved
by x86 using text_mutex, and that text_poke uses
'lockdep_assert_held(&text_mutex);' instead of grabbing the lock
itself, which is why only the x86 apply_relocate_add() currently takes
this mutex.
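
To make the contrast concrete, the two models look roughly like this
(simplified sketches of the upstream code, trimmed from memory, so
treat the exact signatures and details as approximate):

  /* x86: text_poke() expects the caller to hold text_mutex. */
  void *text_poke(void *addr, const void *opcode, size_t len)
  {
          lockdep_assert_held(&text_mutex);
          return __text_poke(text_poke_memcpy, addr, opcode, len);
  }

  /* arm64: each instruction write is serialized internally. */
  static int __aarch64_insn_write(void *addr, __le32 insn)
  {
          unsigned long flags;
          void *waddr;
          int ret;

          raw_spin_lock_irqsave(&patch_lock, flags);
          waddr = patch_map(addr, FIX_TEXT_POKE0);
          ret = copy_to_kernel_nofault(waddr, &insn, AARCH64_INSN_SIZE);
          patch_unmap(FIX_TEXT_POKE0);
          raw_spin_unlock_irqrestore(&patch_lock, flags);

          return ret;
  }

So on x86 the caller-held mutex is the serialization mechanism itself,
while on arm64 the patching primitives already serialize each write on
their own.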

> I see two possibilities:
>
>   1. Either this change gives a false impression that "text_mutex"
>      synchronizes apply_relocate_add() on all architectures.
>
>      This does not seem to be the case on, for example, s390
>      and powerpc.
>
>      => The code is misleading and could lead to trouble.
>

My original intent with this change was to give the late relocations
on arm64 the correct synchronization with respect to other
text-patching code. From what you've shown above, it looks like the
[PATCH 2/2] should work fine without this change since the arm64
patching code already takes patch_lock.

>
>    2. Or it actually provides some synchronization on all
>       architectures, for example, against kprobe code.
>
>       In this case, it might actually fix an existing race.
>       It should be described in the commit message
>       and nominated for backporting to stable.
>

I hadn't really considered this. From what I can tell, kprobe is the
only non-arch-specific code that takes this mutex when touching kernel
text. Though my understanding of kprobe is very limited, I think there
could be a risk due to the late relocations for livepatch:

Suppose I apply a livepatch 'L' that touches some module 'M', but M
isn't currently loaded. Between check_kprobe_address_safe() and
__register_kprobe(), I don't see any check that would fail for a probe
'P' registered on a function inside L. So it seems to me that it's
possible for prepare_kprobe() on L to race with apply_relocate_add()
for L if P is registered while M is being loaded.

Perhaps more importantly, is it ever safe to kprobe an instruction
whose relocations haven't been applied yet? This would probably only be
possible in the case of late relocations for a livepatch, so maybe
this scenario was overlooked. I wonder if check_kprobe_address_safe()
can check for this case and cause the kprobe to fail, preventing the
above race condition from ever being possible.
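
Purely as a sketch of the kind of check I mean (untested, and
klp_object_has_pending_relocs() is made up here as a stand-in for
whatever predicate livepatch could export):

  /* In check_kprobe_address_safe(), once *probed_mod is resolved: */
  if (*probed_mod && is_livepatch_module(*probed_mod) &&
      klp_object_has_pending_relocs(*probed_mod, p->addr)) {
          /* Text may still be rewritten by late relocations. */
          ret = -EBUSY;
          goto out;
  }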

In any case, synchronizing against kprobe wasn't the original intent
of this patch series, so in my opinion it makes sense to resend it as
a standalone patch (if it is to be resent at all).

>
> I am sorry if this has already been discussed. But I have been
> in Cc only for v3 and v4. And there is no changelog in
> the cover letter.
>

This patch was added to the series in v3, which is how you got added
to CC. Sorry about not adding a changelog; I'm still learning the best
practices for sending patches.

> > +
> > +     if (apply)
> > +             ret = apply_relocate_add(sechdrs, strtab, symndx, secndx, pmod);
> > +     else
> > +             clear_relocate_add(sechdrs, strtab, symndx, secndx, pmod);
> > +
> > +     if (!early)
> > +             mutex_unlock(&text_mutex);
> > +     return ret;
> >  }
>
> Best Regards,
> Petr

Thanks,
Dylan

