* [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c)
@ 2025-03-27 20:53 Ingo Molnar
2025-03-27 20:53 ` [PATCH 01/41] x86/alternatives: Rename 'struct bp_patching_desc' to 'struct int3_patching_desc' Ingo Molnar
` (41 more replies)
0 siblings, 42 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
This series has 3 main parts:
(1)
The first part of this series performs a thorough text-patching API namespace
cleanup discussed with Peter Zijlstra:
https://lore.kernel.org/r/20250325123119.GL36322@noisy.programming.kicks-ass.net
Non-SMP APIs retain their existing text_poke*() namespace:
text_poke()
text_poke_sync_each_cpu()
text_poke_kgdb()
text_poke_copy()
text_poke_copy_locked()
text_poke_set()
The SMP text-patching APIs had 3 separate prefixes:
text_poke_
text_poke_bp_
poke_int3_
These get standardized to the single text_poke_int3*() namespace:
text_poke_addr() => text_poke_int3_addr()
poke_int3_handler() => text_poke_int3_handler()
text_poke_bp_batch() => text_poke_int3_batch_process()
text_poke_loc_init() => text_poke_int3_loc_add()
text_poke_flush() => text_poke_int3_flush()
text_poke_finish() => text_poke_int3_finish()
text_poke_queue() => text_poke_int3_queue()
text_poke_bp() => text_poke_int3_now()
(2)
The second part of the series simplifies and standardizes the SMP batch-patching
data & types namespace, around the new tp_array* namespace:
int3_patching_desc => [removed]
temp_mm_state_t => [removed]
try_get_desc() => try_get_tp_array()
put_desc() => put_tp_array()
tp_vec,tp_vec_nr => tp_array
int3_refs => tp_array_refs
(3)
The third part of the series contains additional patches that,
together with the data-namespace simplification changes, remove
about 3 layers of unnecessary indirection and simplify/streamline
various aspects of the code:
[PATCH] x86/alternatives: Remove the confusing, inaccurate & unnecessary 'temp_mm_state_t' abstraction
[PATCH] x86/alternatives: Use non-inverted logic instead of 'tp_order_fail()'
[PATCH] x86/alternatives: Remove the 'addr == NULL means forced-flush' hack from text_poke_int3_finish()/text_poke_int3_flush()/tp_addr_ordered()
[PATCH] x86/alternatives: Simplify text_poke_int3() by using tp_vec and existing APIs
[PATCH] x86/alternatives: Introduce 'struct text_poke_int3_array' and move tp_vec and tp_vec_nr to it
[PATCH] x86/alternatives: Remove the tp_vec indirection
[PATCH] x86/alternatives: Simplify try_get_tp_array()
[PATCH] x86/alternatives: Simplify text_poke_int3_handler()
[PATCH] x86/alternatives: Simplify text_poke_int3_batch()
[PATCH] x86/alternatives: Move the tp_array manipulation into text_poke_int3_loc_init() and rename it to text_poke_int3_loc_add()
[PATCH] x86/alternatives: Move tp_array completion from text_poke_int3_finish() and text_poke_int3_flush() to text_poke_int3_batch_process()
[PATCH] x86/alternatives: Simplify tp_addr_ordered()
Various APIs also had their names clarified, as part of the renames.
I also added comments where justified.
There are almost no functional changes in the end, other than
mixed text_poke_int3_now() & text_poke_int3_queue() calls
are now probably working better than before - although I'm not
aware of such in-tree usage at the moment.
After these changes there's a reduction of about 20 lines of
code if we exclude comments, and some reduction in text size:
   text    data     bss     dec     hex filename
  13637    1009    4112   18758    4946 arch/x86/kernel/alternative.o.before
  13549    1009    4156   18714    491a arch/x86/kernel/alternative.o.after
But the main goal was to perform a thorough round of source code TLC,
to make the code easier to read & maintain, and to remove a chunk
of technical debt accumulated incrementally over 20 years. These
improvements are only partly reflected in the line count and code size decreases.
Lightly tested only.
This tree can also be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip WIP.x86/alternatives
Thanks,
Ingo
================>
Ingo Molnar (41):
x86/alternatives: Rename 'struct bp_patching_desc' to 'struct int3_patching_desc'
x86/alternatives: Rename 'bp_refs' to 'int3_refs'
x86/alternatives: Rename 'text_poke_bp_batch()' to 'text_poke_int3_batch()'
x86/alternatives: Rename 'text_poke_bp()' to 'text_poke_int3()'
x86/alternatives: Rename 'poke_int3_handler()' to 'text_poke_int3_handler()'
x86/alternatives: Rename 'poking_mm' to 'text_poke_mm'
x86/alternatives: Rename 'text_poke_addr' to 'text_poke_int3_addr'
x86/alternatives: Rename 'poking_addr' to 'text_poke_addr'
x86/alternatives: Rename 'bp_desc' to 'int3_desc'
x86/alternatives: Remove duplicate 'text_poke_early()' prototype
x86/alternatives: Update comments in int3_emulate_push()
x86/alternatives: Remove the confusing, inaccurate & unnecessary 'temp_mm_state_t' abstraction
x86/alternatives: Rename 'text_poke_flush()' to 'text_poke_int3_flush()'
x86/alternatives: Rename 'text_poke_finish()' to 'text_poke_int3_finish()'
x86/alternatives: Rename 'text_poke_queue()' to 'text_poke_int3_queue()'
x86/alternatives: Rename 'text_poke_loc_init()' to 'text_poke_int3_loc_init()'
x86/alternatives: Rename 'struct text_poke_loc' to 'struct text_poke_int3_loc'
x86/alternatives: Rename 'struct int3_patching_desc' to 'struct text_poke_int3_vec'
x86/alternatives: Rename 'int3_desc' to 'int3_vec'
x86/alternatives: Add text_mutex assert to text_poke_int3_flush()
x86/alternatives: Assert that text_poke_int3_handler() can only ever handle 'tp_vec[]' based requests
x86/alternatives: Use non-inverted logic instead of 'tp_order_fail()'
x86/alternatives: Remove the 'addr == NULL means forced-flush' hack from text_poke_int3_finish()/text_poke_int3_flush()/tp_addr_ordered()
x86/alternatives: Simplify text_poke_int3() by using tp_vec and existing APIs
x86/alternatives: Assert input parameters in text_poke_int3_batch()
x86/alternatives: Introduce 'struct text_poke_int3_array' and move tp_vec and tp_vec_nr to it
x86/alternatives: Remove the tp_vec indirection
x86/alternatives: Rename 'try_get_desc()' to 'try_get_tp_array()'
x86/alternatives: Rename 'put_desc()' to 'put_tp_array()'
x86/alternatives: Simplify try_get_tp_array()
x86/alternatives: Simplify text_poke_int3_handler()
x86/alternatives: Simplify text_poke_int3_batch()
x86/alternatives: Rename 'text_poke_int3_batch()' to 'text_poke_int3_batch_process()'
x86/alternatives: Rename 'int3_refs' to 'tp_array_refs'
x86/alternatives: Move the tp_array manipulation into text_poke_int3_loc_init() and rename it to text_poke_int3_loc_add()
x86/alternatives: Remove the mixed-patching restriction on text_poke_int3()
x86/alternatives: Rename 'text_poke_int3()' to 'text_poke_int3_now()'
x86/alternatives: Add documentation for text_poke_int3_queue()
x86/alternatives: Move tp_array completion from text_poke_int3_finish() and text_poke_int3_flush() to text_poke_int3_batch_process()
x86/alternatives: Rename 'text_poke_sync()' to 'text_poke_sync_each_cpu()'
x86/alternatives: Simplify tp_addr_ordered()
arch/x86/include/asm/text-patching.h | 23 ++---
arch/x86/kernel/alternative.c | 255 ++++++++++++++++++++++++++++---------------------------
arch/x86/kernel/ftrace.c | 18 ++--
arch/x86/kernel/jump_label.c | 6 +-
arch/x86/kernel/kprobes/core.c | 4 +-
arch/x86/kernel/kprobes/opt.c | 6 +-
arch/x86/kernel/module.c | 2 +-
arch/x86/kernel/static_call.c | 2 +-
arch/x86/kernel/traps.c | 6 +-
arch/x86/mm/init.c | 16 ++--
arch/x86/net/bpf_jit_comp.c | 2 +-
11 files changed, 172 insertions(+), 168 deletions(-)
--
2.45.2
* [PATCH 01/41] x86/alternatives: Rename 'struct bp_patching_desc' to 'struct int3_patching_desc'
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 02/41] x86/alternatives: Rename 'bp_refs' to 'int3_refs' Ingo Molnar
` (40 subsequent siblings)
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/kernel/alternative.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 5f448142aa99..4e932e95c744 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2471,17 +2471,17 @@ struct text_poke_loc {
u8 old;
};
-struct bp_patching_desc {
+struct int3_patching_desc {
struct text_poke_loc *vec;
int nr_entries;
};
static DEFINE_PER_CPU(atomic_t, bp_refs);
-static struct bp_patching_desc bp_desc;
+static struct int3_patching_desc bp_desc;
static __always_inline
-struct bp_patching_desc *try_get_desc(void)
+struct int3_patching_desc *try_get_desc(void)
{
atomic_t *refs = this_cpu_ptr(&bp_refs);
@@ -2517,7 +2517,7 @@ static __always_inline int patch_cmp(const void *key, const void *elt)
noinstr int poke_int3_handler(struct pt_regs *regs)
{
- struct bp_patching_desc *desc;
+ struct int3_patching_desc *desc;
struct text_poke_loc *tp;
int ret = 0;
void *ip;
--
2.45.2
* [PATCH 02/41] x86/alternatives: Rename 'bp_refs' to 'int3_refs'
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
2025-03-27 20:53 ` [PATCH 01/41] x86/alternatives: Rename 'struct bp_patching_desc' to 'struct int3_patching_desc' Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 03/41] x86/alternatives: Rename 'text_poke_bp_batch()' to 'text_poke_int3_batch()' Ingo Molnar
` (39 subsequent siblings)
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/kernel/alternative.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 4e932e95c744..cb9ac69694fb 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2476,14 +2476,14 @@ struct int3_patching_desc {
int nr_entries;
};
-static DEFINE_PER_CPU(atomic_t, bp_refs);
+static DEFINE_PER_CPU(atomic_t, int3_refs);
static struct int3_patching_desc bp_desc;
static __always_inline
struct int3_patching_desc *try_get_desc(void)
{
- atomic_t *refs = this_cpu_ptr(&bp_refs);
+ atomic_t *refs = this_cpu_ptr(&int3_refs);
if (!raw_atomic_inc_not_zero(refs))
return NULL;
@@ -2493,7 +2493,7 @@ struct int3_patching_desc *try_get_desc(void)
static __always_inline void put_desc(void)
{
- atomic_t *refs = this_cpu_ptr(&bp_refs);
+ atomic_t *refs = this_cpu_ptr(&int3_refs);
smp_mb__before_atomic();
raw_atomic_dec(refs);
@@ -2529,9 +2529,9 @@ noinstr int poke_int3_handler(struct pt_regs *regs)
* Having observed our INT3 instruction, we now must observe
* bp_desc with non-zero refcount:
*
- * bp_refs = 1 INT3
+ * int3_refs = 1 INT3
* WMB RMB
- * write INT3 if (bp_refs != 0)
+ * write INT3 if (int3_refs != 0)
*/
smp_rmb();
@@ -2638,7 +2638,7 @@ static void text_poke_bp_batch(struct text_poke_loc *tp, unsigned int nr_entries
* ensure reading a non-zero refcount provides up to date bp_desc data.
*/
for_each_possible_cpu(i)
- atomic_set_release(per_cpu_ptr(&bp_refs, i), 1);
+ atomic_set_release(per_cpu_ptr(&int3_refs, i), 1);
/*
* Function tracing can enable thousands of places that need to be
@@ -2760,7 +2760,7 @@ static void text_poke_bp_batch(struct text_poke_loc *tp, unsigned int nr_entries
* unused.
*/
for_each_possible_cpu(i) {
- atomic_t *refs = per_cpu_ptr(&bp_refs, i);
+ atomic_t *refs = per_cpu_ptr(&int3_refs, i);
if (unlikely(!atomic_dec_and_test(refs)))
atomic_cond_read_acquire(refs, !VAL);
--
2.45.2
* [PATCH 03/41] x86/alternatives: Rename 'text_poke_bp_batch()' to 'text_poke_int3_batch()'
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
2025-03-27 20:53 ` [PATCH 01/41] x86/alternatives: Rename 'struct bp_patching_desc' to 'struct int3_patching_desc' Ingo Molnar
2025-03-27 20:53 ` [PATCH 02/41] x86/alternatives: Rename 'bp_refs' to 'int3_refs' Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 04/41] x86/alternatives: Rename 'text_poke_bp()' to 'text_poke_int3()' Ingo Molnar
` (38 subsequent siblings)
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/kernel/alternative.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index cb9ac69694fb..b0fa770b7460 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2467,7 +2467,7 @@ struct text_poke_loc {
u8 len;
u8 opcode;
const u8 text[POKE_MAX_OPCODE_SIZE];
- /* see text_poke_bp_batch() */
+ /* see text_poke_int3_batch() */
u8 old;
};
@@ -2540,7 +2540,7 @@ noinstr int poke_int3_handler(struct pt_regs *regs)
return 0;
/*
- * Discount the INT3. See text_poke_bp_batch().
+ * Discount the INT3. See text_poke_int3_batch().
*/
ip = (void *) regs->ip - INT3_INSN_SIZE;
@@ -2602,7 +2602,7 @@ static struct text_poke_loc tp_vec[TP_VEC_MAX];
static int tp_vec_nr;
/**
- * text_poke_bp_batch() -- update instructions on live kernel on SMP
+ * text_poke_int3_batch() -- update instructions on live kernel on SMP
* @tp: vector of instructions to patch
* @nr_entries: number of entries in the vector
*
@@ -2622,7 +2622,7 @@ static int tp_vec_nr;
* replacing opcode
* - sync cores
*/
-static void text_poke_bp_batch(struct text_poke_loc *tp, unsigned int nr_entries)
+static void text_poke_int3_batch(struct text_poke_loc *tp, unsigned int nr_entries)
{
unsigned char int3 = INT3_INSN_OPCODE;
unsigned int i;
@@ -2866,7 +2866,7 @@ static bool tp_order_fail(void *addr)
static void text_poke_flush(void *addr)
{
if (tp_vec_nr == TP_VEC_MAX || tp_order_fail(addr)) {
- text_poke_bp_batch(tp_vec, tp_vec_nr);
+ text_poke_int3_batch(tp_vec, tp_vec_nr);
tp_vec_nr = 0;
}
}
@@ -2902,5 +2902,5 @@ void __ref text_poke_bp(void *addr, const void *opcode, size_t len, const void *
struct text_poke_loc tp;
text_poke_loc_init(&tp, addr, opcode, len, emulate);
- text_poke_bp_batch(&tp, 1);
+ text_poke_int3_batch(&tp, 1);
}
--
2.45.2
* [PATCH 04/41] x86/alternatives: Rename 'text_poke_bp()' to 'text_poke_int3()'
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
` (2 preceding siblings ...)
2025-03-27 20:53 ` [PATCH 03/41] x86/alternatives: Rename 'text_poke_bp_batch()' to 'text_poke_int3_batch()' Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 05/41] x86/alternatives: Rename 'poke_int3_handler()' to 'text_poke_int3_handler()' Ingo Molnar
` (37 subsequent siblings)
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/include/asm/text-patching.h | 2 +-
arch/x86/kernel/alternative.c | 4 ++--
arch/x86/kernel/ftrace.c | 8 ++++----
arch/x86/kernel/jump_label.c | 2 +-
arch/x86/kernel/kprobes/opt.c | 2 +-
arch/x86/kernel/static_call.c | 2 +-
arch/x86/net/bpf_jit_comp.c | 2 +-
7 files changed, 11 insertions(+), 11 deletions(-)
diff --git a/arch/x86/include/asm/text-patching.h b/arch/x86/include/asm/text-patching.h
index ab9e143ec9fe..944b2aad4351 100644
--- a/arch/x86/include/asm/text-patching.h
+++ b/arch/x86/include/asm/text-patching.h
@@ -39,7 +39,7 @@ extern void *text_poke_copy(void *addr, const void *opcode, size_t len);
extern void *text_poke_copy_locked(void *addr, const void *opcode, size_t len, bool core_ok);
extern void *text_poke_set(void *addr, int c, size_t len);
extern int poke_int3_handler(struct pt_regs *regs);
-extern void text_poke_bp(void *addr, const void *opcode, size_t len, const void *emulate);
+extern void text_poke_int3(void *addr, const void *opcode, size_t len, const void *emulate);
extern void text_poke_queue(void *addr, const void *opcode, size_t len, const void *emulate);
extern void text_poke_finish(void);
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index b0fa770b7460..661cb6b1fbc3 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2887,7 +2887,7 @@ void __ref text_poke_queue(void *addr, const void *opcode, size_t len, const voi
}
/**
- * text_poke_bp() -- update instructions on live kernel on SMP
+ * text_poke_int3() -- update instructions on live kernel on SMP
* @addr: address to patch
* @opcode: opcode of new instruction
* @len: length to copy
@@ -2897,7 +2897,7 @@ void __ref text_poke_queue(void *addr, const void *opcode, size_t len, const voi
* dynamically allocated memory. This function should be used when it is
* not possible to allocate memory.
*/
-void __ref text_poke_bp(void *addr, const void *opcode, size_t len, const void *emulate)
+void __ref text_poke_int3(void *addr, const void *opcode, size_t len, const void *emulate)
{
struct text_poke_loc tp;
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index cace6e8d7cc7..4e284ff674f1 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -186,11 +186,11 @@ int ftrace_update_ftrace_func(ftrace_func_t func)
ip = (unsigned long)(&ftrace_call);
new = ftrace_call_replace(ip, (unsigned long)func);
- text_poke_bp((void *)ip, new, MCOUNT_INSN_SIZE, NULL);
+ text_poke_int3((void *)ip, new, MCOUNT_INSN_SIZE, NULL);
ip = (unsigned long)(&ftrace_regs_call);
new = ftrace_call_replace(ip, (unsigned long)func);
- text_poke_bp((void *)ip, new, MCOUNT_INSN_SIZE, NULL);
+ text_poke_int3((void *)ip, new, MCOUNT_INSN_SIZE, NULL);
return 0;
}
@@ -492,7 +492,7 @@ void arch_ftrace_update_trampoline(struct ftrace_ops *ops)
mutex_lock(&text_mutex);
/* Do a safe modify in case the trampoline is executing */
new = ftrace_call_replace(ip, (unsigned long)func);
- text_poke_bp((void *)ip, new, MCOUNT_INSN_SIZE, NULL);
+ text_poke_int3((void *)ip, new, MCOUNT_INSN_SIZE, NULL);
mutex_unlock(&text_mutex);
}
@@ -586,7 +586,7 @@ static int ftrace_mod_jmp(unsigned long ip, void *func)
const char *new;
new = ftrace_jmp_replace(ip, (unsigned long)func);
- text_poke_bp((void *)ip, new, MCOUNT_INSN_SIZE, NULL);
+ text_poke_int3((void *)ip, new, MCOUNT_INSN_SIZE, NULL);
return 0;
}
diff --git a/arch/x86/kernel/jump_label.c b/arch/x86/kernel/jump_label.c
index f5b8ef02d172..94e2dcc94d9d 100644
--- a/arch/x86/kernel/jump_label.c
+++ b/arch/x86/kernel/jump_label.c
@@ -102,7 +102,7 @@ __jump_label_transform(struct jump_entry *entry,
return;
}
- text_poke_bp((void *)jump_entry_code(entry), jlp.code, jlp.size, NULL);
+ text_poke_int3((void *)jump_entry_code(entry), jlp.code, jlp.size, NULL);
}
static void __ref jump_label_transform(struct jump_entry *entry,
diff --git a/arch/x86/kernel/kprobes/opt.c b/arch/x86/kernel/kprobes/opt.c
index 36d6809c6c9e..e13d4a2d9244 100644
--- a/arch/x86/kernel/kprobes/opt.c
+++ b/arch/x86/kernel/kprobes/opt.c
@@ -488,7 +488,7 @@ void arch_optimize_kprobes(struct list_head *oplist)
insn_buff[0] = JMP32_INSN_OPCODE;
*(s32 *)(&insn_buff[1]) = rel;
- text_poke_bp(op->kp.addr, insn_buff, JMP32_INSN_SIZE, NULL);
+ text_poke_int3(op->kp.addr, insn_buff, JMP32_INSN_SIZE, NULL);
list_del_init(&op->list);
}
diff --git a/arch/x86/kernel/static_call.c b/arch/x86/kernel/static_call.c
index 9e51242ed125..3331a7c90b9a 100644
--- a/arch/x86/kernel/static_call.c
+++ b/arch/x86/kernel/static_call.c
@@ -108,7 +108,7 @@ static void __ref __static_call_transform(void *insn, enum insn_type type,
if (system_state == SYSTEM_BOOTING || modinit)
return text_poke_early(insn, code, size);
- text_poke_bp(insn, code, size, emulate);
+ text_poke_int3(insn, code, size, emulate);
}
static void __static_call_validate(u8 *insn, bool tail, bool tramp)
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 72776dcb75aa..1e2a4b7a6b73 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -629,7 +629,7 @@ static int __bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t,
goto out;
ret = 1;
if (memcmp(ip, new_insn, X86_PATCH_SIZE)) {
- text_poke_bp(ip, new_insn, X86_PATCH_SIZE, NULL);
+ text_poke_int3(ip, new_insn, X86_PATCH_SIZE, NULL);
ret = 0;
}
out:
--
2.45.2
* [PATCH 05/41] x86/alternatives: Rename 'poke_int3_handler()' to 'text_poke_int3_handler()'
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
` (3 preceding siblings ...)
2025-03-27 20:53 ` [PATCH 04/41] x86/alternatives: Rename 'text_poke_bp()' to 'text_poke_int3()' Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 06/41] x86/alternatives: Rename 'poking_mm' to 'text_poke_mm' Ingo Molnar
` (36 subsequent siblings)
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
All related functions in this subsystem already have a
text_poke_int3_ prefix - add it to the trap handler
as well.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/include/asm/text-patching.h | 2 +-
arch/x86/kernel/alternative.c | 2 +-
arch/x86/kernel/traps.c | 6 +++---
3 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/arch/x86/include/asm/text-patching.h b/arch/x86/include/asm/text-patching.h
index 944b2aad4351..fbca0bd725b6 100644
--- a/arch/x86/include/asm/text-patching.h
+++ b/arch/x86/include/asm/text-patching.h
@@ -38,7 +38,7 @@ extern void *text_poke_copy(void *addr, const void *opcode, size_t len);
#define text_poke_copy text_poke_copy
extern void *text_poke_copy_locked(void *addr, const void *opcode, size_t len, bool core_ok);
extern void *text_poke_set(void *addr, int c, size_t len);
-extern int poke_int3_handler(struct pt_regs *regs);
+extern int text_poke_int3_handler(struct pt_regs *regs);
extern void text_poke_int3(void *addr, const void *opcode, size_t len, const void *emulate);
extern void text_poke_queue(void *addr, const void *opcode, size_t len, const void *emulate);
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 661cb6b1fbc3..5d410c97d451 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2515,7 +2515,7 @@ static __always_inline int patch_cmp(const void *key, const void *elt)
return 0;
}
-noinstr int poke_int3_handler(struct pt_regs *regs)
+noinstr int text_poke_int3_handler(struct pt_regs *regs)
{
struct int3_patching_desc *desc;
struct text_poke_loc *tp;
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 9f88b8a78e50..b4c7bfb06ea1 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -882,16 +882,16 @@ static void do_int3_user(struct pt_regs *regs)
DEFINE_IDTENTRY_RAW(exc_int3)
{
/*
- * poke_int3_handler() is completely self contained code; it does (and
+ * text_poke_int3_handler() is completely self contained code; it does (and
* must) *NOT* call out to anything, lest it hits upon yet another
* INT3.
*/
- if (poke_int3_handler(regs))
+ if (text_poke_int3_handler(regs))
return;
/*
* irqentry_enter_from_user_mode() uses static_branch_{,un}likely()
- * and therefore can trigger INT3, hence poke_int3_handler() must
+ * and therefore can trigger INT3, hence text_poke_int3_handler() must
* be done before. If the entry came from kernel mode, then use
* nmi_enter() because the INT3 could have been hit in any context
* including NMI.
--
2.45.2
* [PATCH 06/41] x86/alternatives: Rename 'poking_mm' to 'text_poke_mm'
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
` (4 preceding siblings ...)
2025-03-27 20:53 ` [PATCH 05/41] x86/alternatives: Rename 'poke_int3_handler()' to 'text_poke_int3_handler()' Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 07/41] x86/alternatives: Rename 'text_poke_addr' to 'text_poke_int3_addr' Ingo Molnar
` (35 subsequent siblings)
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
Put it into the text_poke_* namespace of <asm/text-patching.h>.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/include/asm/text-patching.h | 2 +-
arch/x86/kernel/alternative.c | 18 +++++++++---------
arch/x86/mm/init.c | 8 ++++----
3 files changed, 14 insertions(+), 14 deletions(-)
diff --git a/arch/x86/include/asm/text-patching.h b/arch/x86/include/asm/text-patching.h
index fbca0bd725b6..e41ea3040680 100644
--- a/arch/x86/include/asm/text-patching.h
+++ b/arch/x86/include/asm/text-patching.h
@@ -128,7 +128,7 @@ void *text_gen_insn(u8 opcode, const void *addr, const void *dest)
}
extern int after_bootmem;
-extern __ro_after_init struct mm_struct *poking_mm;
+extern __ro_after_init struct mm_struct *text_poke_mm;
extern __ro_after_init unsigned long poking_addr;
#ifndef CONFIG_UML_X86
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 5d410c97d451..f4baeeaa6c0c 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2191,7 +2191,7 @@ static inline temp_mm_state_t use_temporary_mm(struct mm_struct *mm)
return temp_state;
}
-__ro_after_init struct mm_struct *poking_mm;
+__ro_after_init struct mm_struct *text_poke_mm;
__ro_after_init unsigned long poking_addr;
static inline void unuse_temporary_mm(temp_mm_state_t prev_state)
@@ -2201,7 +2201,7 @@ static inline void unuse_temporary_mm(temp_mm_state_t prev_state)
switch_mm_irqs_off(NULL, prev_state.mm, current);
/* Clear the cpumask, to indicate no TLB flushing is needed anywhere */
- cpumask_clear_cpu(raw_smp_processor_id(), mm_cpumask(poking_mm));
+ cpumask_clear_cpu(raw_smp_processor_id(), mm_cpumask(text_poke_mm));
/*
* Restore the breakpoints if they were disabled before the temporary mm
@@ -2266,7 +2266,7 @@ static void *__text_poke(text_poke_f func, void *addr, const void *src, size_t l
/*
* The lock is not really needed, but this allows to avoid open-coding.
*/
- ptep = get_locked_pte(poking_mm, poking_addr, &ptl);
+ ptep = get_locked_pte(text_poke_mm, poking_addr, &ptl);
/*
* This must not fail; preallocated in poking_init().
@@ -2276,18 +2276,18 @@ static void *__text_poke(text_poke_f func, void *addr, const void *src, size_t l
local_irq_save(flags);
pte = mk_pte(pages[0], pgprot);
- set_pte_at(poking_mm, poking_addr, ptep, pte);
+ set_pte_at(text_poke_mm, poking_addr, ptep, pte);
if (cross_page_boundary) {
pte = mk_pte(pages[1], pgprot);
- set_pte_at(poking_mm, poking_addr + PAGE_SIZE, ptep + 1, pte);
+ set_pte_at(text_poke_mm, poking_addr + PAGE_SIZE, ptep + 1, pte);
}
/*
* Loading the temporary mm behaves as a compiler barrier, which
* guarantees that the PTE will be set at the time memcpy() is done.
*/
- prev = use_temporary_mm(poking_mm);
+ prev = use_temporary_mm(text_poke_mm);
kasan_disable_current();
func((u8 *)poking_addr + offset_in_page(addr), src, len);
@@ -2299,9 +2299,9 @@ static void *__text_poke(text_poke_f func, void *addr, const void *src, size_t l
*/
barrier();
- pte_clear(poking_mm, poking_addr, ptep);
+ pte_clear(text_poke_mm, poking_addr, ptep);
if (cross_page_boundary)
- pte_clear(poking_mm, poking_addr + PAGE_SIZE, ptep + 1);
+ pte_clear(text_poke_mm, poking_addr + PAGE_SIZE, ptep + 1);
/*
* Loading the previous page-table hierarchy requires a serializing
@@ -2314,7 +2314,7 @@ static void *__text_poke(text_poke_f func, void *addr, const void *src, size_t l
* Flushing the TLB might involve IPIs, which would require enabled
* IRQs, but not if the mm is not used, as it is in this point.
*/
- flush_tlb_mm_range(poking_mm, poking_addr, poking_addr +
+ flush_tlb_mm_range(text_poke_mm, poking_addr, poking_addr +
(cross_page_boundary ? 2 : 1) * PAGE_SIZE,
PAGE_SHIFT, false);
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index bfa444a7dbb0..84b52a1ebd48 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -824,11 +824,11 @@ void __init poking_init(void)
spinlock_t *ptl;
pte_t *ptep;
- poking_mm = mm_alloc();
- BUG_ON(!poking_mm);
+ text_poke_mm = mm_alloc();
+ BUG_ON(!text_poke_mm);
/* Xen PV guests need the PGD to be pinned. */
- paravirt_enter_mmap(poking_mm);
+ paravirt_enter_mmap(text_poke_mm);
/*
* Randomize the poking address, but make sure that the following page
@@ -848,7 +848,7 @@ void __init poking_init(void)
* needed for poking now. Later, poking may be performed in an atomic
* section, which might cause allocation to fail.
*/
- ptep = get_locked_pte(poking_mm, poking_addr, &ptl);
+ ptep = get_locked_pte(text_poke_mm, poking_addr, &ptl);
BUG_ON(!ptep);
pte_unmap_unlock(ptep, ptl);
}
--
2.45.2
* [PATCH 07/41] x86/alternatives: Rename 'text_poke_addr' to 'text_poke_int3_addr'
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
` (5 preceding siblings ...)
2025-03-27 20:53 ` [PATCH 06/41] x86/alternatives: Rename 'poking_mm' to 'text_poke_mm' Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 08/41] x86/alternatives: Rename 'poking_addr' to 'text_poke_addr' Ingo Molnar
` (34 subsequent siblings)
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/kernel/alternative.c | 22 +++++++++++-----------
1 file changed, 11 insertions(+), 11 deletions(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index f4baeeaa6c0c..100b606d7dde 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2499,7 +2499,7 @@ static __always_inline void put_desc(void)
raw_atomic_dec(refs);
}
-static __always_inline void *text_poke_addr(struct text_poke_loc *tp)
+static __always_inline void *text_poke_int3_addr(struct text_poke_loc *tp)
{
return _stext + tp->rel_addr;
}
@@ -2508,9 +2508,9 @@ static __always_inline int patch_cmp(const void *key, const void *elt)
{
struct text_poke_loc *tp = (struct text_poke_loc *) elt;
- if (key < text_poke_addr(tp))
+ if (key < text_poke_int3_addr(tp))
return -1;
- if (key > text_poke_addr(tp))
+ if (key > text_poke_int3_addr(tp))
return 1;
return 0;
}
@@ -2555,7 +2555,7 @@ noinstr int text_poke_int3_handler(struct pt_regs *regs)
goto out_put;
} else {
tp = desc->vec;
- if (text_poke_addr(tp) != ip)
+ if (text_poke_int3_addr(tp) != ip)
goto out_put;
}
@@ -2660,8 +2660,8 @@ static void text_poke_int3_batch(struct text_poke_loc *tp, unsigned int nr_entri
* First step: add a int3 trap to the address that will be patched.
*/
for (i = 0; i < nr_entries; i++) {
- tp[i].old = *(u8 *)text_poke_addr(&tp[i]);
- text_poke(text_poke_addr(&tp[i]), &int3, INT3_INSN_SIZE);
+ tp[i].old = *(u8 *)text_poke_int3_addr(&tp[i]);
+ text_poke(text_poke_int3_addr(&tp[i]), &int3, INT3_INSN_SIZE);
}
text_poke_sync();
@@ -2677,7 +2677,7 @@ static void text_poke_int3_batch(struct text_poke_loc *tp, unsigned int nr_entri
if (len - INT3_INSN_SIZE > 0) {
memcpy(old + INT3_INSN_SIZE,
- text_poke_addr(&tp[i]) + INT3_INSN_SIZE,
+ text_poke_int3_addr(&tp[i]) + INT3_INSN_SIZE,
len - INT3_INSN_SIZE);
if (len == 6) {
@@ -2686,7 +2686,7 @@ static void text_poke_int3_batch(struct text_poke_loc *tp, unsigned int nr_entri
new = _new;
}
- text_poke(text_poke_addr(&tp[i]) + INT3_INSN_SIZE,
+ text_poke(text_poke_int3_addr(&tp[i]) + INT3_INSN_SIZE,
new + INT3_INSN_SIZE,
len - INT3_INSN_SIZE);
@@ -2717,7 +2717,7 @@ static void text_poke_int3_batch(struct text_poke_loc *tp, unsigned int nr_entri
* The old instruction is recorded so that the event can be
* processed forwards or backwards.
*/
- perf_event_text_poke(text_poke_addr(&tp[i]), old, len, new, len);
+ perf_event_text_poke(text_poke_int3_addr(&tp[i]), old, len, new, len);
}
if (do_sync) {
@@ -2742,7 +2742,7 @@ static void text_poke_int3_batch(struct text_poke_loc *tp, unsigned int nr_entri
if (byte == INT3_INSN_OPCODE)
continue;
- text_poke(text_poke_addr(&tp[i]), &byte, INT3_INSN_SIZE);
+ text_poke(text_poke_int3_addr(&tp[i]), &byte, INT3_INSN_SIZE);
do_sync++;
}
@@ -2857,7 +2857,7 @@ static bool tp_order_fail(void *addr)
return true;
tp = &tp_vec[tp_vec_nr - 1];
- if ((unsigned long)text_poke_addr(tp) > (unsigned long)addr)
+ if ((unsigned long)text_poke_int3_addr(tp) > (unsigned long)addr)
return true;
return false;
--
2.45.2
* [PATCH 08/41] x86/alternatives: Rename 'poking_addr' to 'text_poke_addr'
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
` (6 preceding siblings ...)
2025-03-27 20:53 ` [PATCH 07/41] x86/alternatives: Rename 'text_poke_addr' to 'text_poke_int3_addr' Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 09/41] x86/alternatives: Rename 'bp_desc' to 'int3_desc' Ingo Molnar
` (33 subsequent siblings)
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
Put it into the text_poke_* namespace of <asm/text-patching.h>.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/include/asm/text-patching.h | 2 +-
arch/x86/kernel/alternative.c | 16 ++++++++--------
arch/x86/mm/init.c | 10 +++++-----
3 files changed, 14 insertions(+), 14 deletions(-)
diff --git a/arch/x86/include/asm/text-patching.h b/arch/x86/include/asm/text-patching.h
index e41ea3040680..9f6f011f5696 100644
--- a/arch/x86/include/asm/text-patching.h
+++ b/arch/x86/include/asm/text-patching.h
@@ -129,7 +129,7 @@ void *text_gen_insn(u8 opcode, const void *addr, const void *dest)
extern int after_bootmem;
extern __ro_after_init struct mm_struct *text_poke_mm;
-extern __ro_after_init unsigned long poking_addr;
+extern __ro_after_init unsigned long text_poke_addr;
#ifndef CONFIG_UML_X86
static __always_inline
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 100b606d7dde..8c4bfb6d9a95 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2192,7 +2192,7 @@ static inline temp_mm_state_t use_temporary_mm(struct mm_struct *mm)
}
__ro_after_init struct mm_struct *text_poke_mm;
-__ro_after_init unsigned long poking_addr;
+__ro_after_init unsigned long text_poke_addr;
static inline void unuse_temporary_mm(temp_mm_state_t prev_state)
{
@@ -2266,7 +2266,7 @@ static void *__text_poke(text_poke_f func, void *addr, const void *src, size_t l
/*
* The lock is not really needed, but this allows to avoid open-coding.
*/
- ptep = get_locked_pte(text_poke_mm, poking_addr, &ptl);
+ ptep = get_locked_pte(text_poke_mm, text_poke_addr, &ptl);
/*
* This must not fail; preallocated in poking_init().
@@ -2276,11 +2276,11 @@ static void *__text_poke(text_poke_f func, void *addr, const void *src, size_t l
local_irq_save(flags);
pte = mk_pte(pages[0], pgprot);
- set_pte_at(text_poke_mm, poking_addr, ptep, pte);
+ set_pte_at(text_poke_mm, text_poke_addr, ptep, pte);
if (cross_page_boundary) {
pte = mk_pte(pages[1], pgprot);
- set_pte_at(text_poke_mm, poking_addr + PAGE_SIZE, ptep + 1, pte);
+ set_pte_at(text_poke_mm, text_poke_addr + PAGE_SIZE, ptep + 1, pte);
}
/*
@@ -2290,7 +2290,7 @@ static void *__text_poke(text_poke_f func, void *addr, const void *src, size_t l
prev = use_temporary_mm(text_poke_mm);
kasan_disable_current();
- func((u8 *)poking_addr + offset_in_page(addr), src, len);
+ func((u8 *)text_poke_addr + offset_in_page(addr), src, len);
kasan_enable_current();
/*
@@ -2299,9 +2299,9 @@ static void *__text_poke(text_poke_f func, void *addr, const void *src, size_t l
*/
barrier();
- pte_clear(text_poke_mm, poking_addr, ptep);
+ pte_clear(text_poke_mm, text_poke_addr, ptep);
if (cross_page_boundary)
- pte_clear(text_poke_mm, poking_addr + PAGE_SIZE, ptep + 1);
+ pte_clear(text_poke_mm, text_poke_addr + PAGE_SIZE, ptep + 1);
/*
* Loading the previous page-table hierarchy requires a serializing
@@ -2314,7 +2314,7 @@ static void *__text_poke(text_poke_f func, void *addr, const void *src, size_t l
* Flushing the TLB might involve IPIs, which would require enabled
* IRQs, but not if the mm is not used, as it is in this point.
*/
- flush_tlb_mm_range(text_poke_mm, poking_addr, poking_addr +
+ flush_tlb_mm_range(text_poke_mm, text_poke_addr, text_poke_addr +
(cross_page_boundary ? 2 : 1) * PAGE_SIZE,
PAGE_SHIFT, false);
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 84b52a1ebd48..031741912981 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -835,20 +835,20 @@ void __init poking_init(void)
* will be mapped at the same PMD. We need 2 pages, so find space for 3,
* and adjust the address if the PMD ends after the first one.
*/
- poking_addr = TASK_UNMAPPED_BASE;
+ text_poke_addr = TASK_UNMAPPED_BASE;
if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
- poking_addr += (kaslr_get_random_long("Poking") & PAGE_MASK) %
+ text_poke_addr += (kaslr_get_random_long("Poking") & PAGE_MASK) %
(TASK_SIZE - TASK_UNMAPPED_BASE - 3 * PAGE_SIZE);
- if (((poking_addr + PAGE_SIZE) & ~PMD_MASK) == 0)
- poking_addr += PAGE_SIZE;
+ if (((text_poke_addr + PAGE_SIZE) & ~PMD_MASK) == 0)
+ text_poke_addr += PAGE_SIZE;
/*
* We need to trigger the allocation of the page-tables that will be
* needed for poking now. Later, poking may be performed in an atomic
* section, which might cause allocation to fail.
*/
- ptep = get_locked_pte(text_poke_mm, poking_addr, &ptl);
+ ptep = get_locked_pte(text_poke_mm, text_poke_addr, &ptl);
BUG_ON(!ptep);
pte_unmap_unlock(ptep, ptl);
}
--
2.45.2
* [PATCH 09/41] x86/alternatives: Rename 'bp_desc' to 'int3_desc'
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
` (7 preceding siblings ...)
2025-03-27 20:53 ` [PATCH 08/41] x86/alternatives: Rename 'poking_addr' to 'text_poke_addr' Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 10/41] x86/alternatives: Remove duplicate 'text_poke_early()' prototype Ingo Molnar
` (32 subsequent siblings)
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/kernel/alternative.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 8c4bfb6d9a95..44b8e2826808 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2478,7 +2478,7 @@ struct int3_patching_desc {
static DEFINE_PER_CPU(atomic_t, int3_refs);
-static struct int3_patching_desc bp_desc;
+static struct int3_patching_desc int3_desc;
static __always_inline
struct int3_patching_desc *try_get_desc(void)
@@ -2488,7 +2488,7 @@ struct int3_patching_desc *try_get_desc(void)
if (!raw_atomic_inc_not_zero(refs))
return NULL;
- return &bp_desc;
+ return &int3_desc;
}
static __always_inline void put_desc(void)
@@ -2527,7 +2527,7 @@ noinstr int text_poke_int3_handler(struct pt_regs *regs)
/*
* Having observed our INT3 instruction, we now must observe
- * bp_desc with non-zero refcount:
+ * int3_desc with non-zero refcount:
*
* int3_refs = 1 INT3
* WMB RMB
@@ -2630,12 +2630,12 @@ static void text_poke_int3_batch(struct text_poke_loc *tp, unsigned int nr_entri
lockdep_assert_held(&text_mutex);
- bp_desc.vec = tp;
- bp_desc.nr_entries = nr_entries;
+ int3_desc.vec = tp;
+ int3_desc.nr_entries = nr_entries;
/*
* Corresponds to the implicit memory barrier in try_get_desc() to
- * ensure reading a non-zero refcount provides up to date bp_desc data.
+ * ensure reading a non-zero refcount provides up to date int3_desc data.
*/
for_each_possible_cpu(i)
atomic_set_release(per_cpu_ptr(&int3_refs, i), 1);
--
2.45.2
* [PATCH 10/41] x86/alternatives: Remove duplicate 'text_poke_early()' prototype
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
` (8 preceding siblings ...)
2025-03-27 20:53 ` [PATCH 09/41] x86/alternatives: Rename 'bp_desc' to 'int3_desc' Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 11/41] x86/alternatives: Update comments in int3_emulate_push() Ingo Molnar
` (31 subsequent siblings)
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
It's declared in <asm/text-patching.h> already.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/kernel/alternative.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 44b8e2826808..7d14c8abd3aa 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -176,7 +176,6 @@ extern s32 __return_sites[], __return_sites_end[];
extern s32 __cfi_sites[], __cfi_sites_end[];
extern s32 __ibt_endbr_seal[], __ibt_endbr_seal_end[];
extern s32 __smp_locks[], __smp_locks_end[];
-void text_poke_early(void *addr, const void *opcode, size_t len);
/*
* Matches NOP and NOPL, not any of the other possible NOPs.
--
2.45.2
* [PATCH 11/41] x86/alternatives: Update comments in int3_emulate_push()
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
` (9 preceding siblings ...)
2025-03-27 20:53 ` [PATCH 10/41] x86/alternatives: Remove duplicate 'text_poke_early()' prototype Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 12/41] x86/alternatives: Remove the confusing, inaccurate & unnecessary 'temp_mm_state_t' abstraction Ingo Molnar
` (30 subsequent siblings)
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
The idtentry macro in entry_64.S hasn't had a create_gap
option for 5 years - update the comment.
(Also clean up the entire comment block while at it.)
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/include/asm/text-patching.h | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/arch/x86/include/asm/text-patching.h b/arch/x86/include/asm/text-patching.h
index 9f6f011f5696..c2dbb0e4d80d 100644
--- a/arch/x86/include/asm/text-patching.h
+++ b/arch/x86/include/asm/text-patching.h
@@ -142,13 +142,14 @@ static __always_inline
void int3_emulate_push(struct pt_regs *regs, unsigned long val)
{
/*
- * The int3 handler in entry_64.S adds a gap between the
+ * The INT3 handler in entry_64.S adds a gap between the
* stack where the break point happened, and the saving of
* pt_regs. We can extend the original stack because of
- * this gap. See the idtentry macro's create_gap option.
+ * this gap. See the idtentry macro's X86_TRAP_BP logic.
*
- * Similarly entry_32.S will have a gap on the stack for (any) hardware
- * exception and pt_regs; see FIXUP_FRAME.
+ * Similarly, entry_32.S will have a gap on the stack for
+ * (any) hardware exception and pt_regs; see the
+ * FIXUP_FRAME macro.
*/
regs->sp -= sizeof(unsigned long);
*(unsigned long *)regs->sp = val;
--
2.45.2
* [PATCH 12/41] x86/alternatives: Remove the confusing, inaccurate & unnecessary 'temp_mm_state_t' abstraction
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
` (10 preceding siblings ...)
2025-03-27 20:53 ` [PATCH 11/41] x86/alternatives: Update comments in int3_emulate_push() Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 13/41] x86/alternatives: Rename 'text_poke_flush()' to 'text_poke_int3_flush()' Ingo Molnar
` (29 subsequent siblings)
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
So the temp_mm_state_t abstraction used by use_temporary_mm() and
unuse_temporary_mm() is super confusing:
- The whole machinery is about temporarily switching to the
text_poke_mm utility MM that got allocated during bootup
for text-patching purposes alone:
temp_mm_state_t prev;
/*
* Loading the temporary mm behaves as a compiler barrier, which
* guarantees that the PTE will be set at the time memcpy() is done.
*/
prev = use_temporary_mm(text_poke_mm);
- Yet the value that gets saved in the temp_mm_state_t variable
is not the temporary MM ... but the previous MM...
- I.e. we temporarily put the non-temporary MM into a variable
that has the temp_mm_state_t type. This makes no sense whatsoever.
- The confusion continues in unuse_temporary_mm():
static inline void unuse_temporary_mm(temp_mm_state_t prev_state)
Here we unuse an MM that is ... not the temporary MM, but the
previous MM. :-/
Fix up all this confusion by removing the unnecessary layer of
abstraction and using a bog-standard 'struct mm_struct *prev_mm'
variable to save the MM to.
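The save/restore pattern this patch arrives at can be modelled in user space roughly as below. This is only a sketch under stated assumptions, not the kernel implementation: 'struct mm_struct' is reduced to a stub, and 'loaded_mm' and switch_mm_stub() are hypothetical stand-ins for the per-CPU cpu_tlbstate.loaded_mm and switch_mm_irqs_off(). The lazy-TLB, breakpoint and cpumask handling of the real functions is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Stub standing in for the kernel's struct mm_struct (assumption). */
struct mm_struct {
	int id;
};

/* Hypothetical stand-in for this CPU's cpu_tlbstate.loaded_mm. */
static struct mm_struct *loaded_mm;

/* Hypothetical stand-in for switch_mm_irqs_off(). */
static void switch_mm_stub(struct mm_struct *next)
{
	loaded_mm = next;
}

/* Switch to temp_mm and hand back the MM that was loaded before. */
static struct mm_struct *use_temporary_mm(struct mm_struct *temp_mm)
{
	struct mm_struct *prev_mm = loaded_mm;

	switch_mm_stub(temp_mm);
	return prev_mm;
}

/* Switch back to the MM that use_temporary_mm() returned. */
static void unuse_temporary_mm(struct mm_struct *prev_mm)
{
	assert(prev_mm != NULL);
	switch_mm_stub(prev_mm);
}
```

With a plain pointer, the name of the variable says what it holds: the caller stores the previous MM in 'prev_mm' and passes that same pointer back to unuse_temporary_mm().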
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/kernel/alternative.c | 24 ++++++++++--------------
1 file changed, 10 insertions(+), 14 deletions(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 7d14c8abd3aa..557ee2546177 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2139,10 +2139,6 @@ void __init_or_module text_poke_early(void *addr, const void *opcode,
}
}
-typedef struct {
- struct mm_struct *mm;
-} temp_mm_state_t;
-
/*
* Using a temporary mm allows to set temporary mappings that are not accessible
* by other CPUs. Such mappings are needed to perform sensitive memory writes
@@ -2156,9 +2152,9 @@ typedef struct {
* loaded, thereby preventing interrupt handler bugs from overriding
* the kernel memory protection.
*/
-static inline temp_mm_state_t use_temporary_mm(struct mm_struct *mm)
+static inline struct mm_struct *use_temporary_mm(struct mm_struct *temp_mm)
{
- temp_mm_state_t temp_state;
+ struct mm_struct *prev_mm;
lockdep_assert_irqs_disabled();
@@ -2170,8 +2166,8 @@ static inline temp_mm_state_t use_temporary_mm(struct mm_struct *mm)
if (this_cpu_read(cpu_tlbstate_shared.is_lazy))
leave_mm();
- temp_state.mm = this_cpu_read(cpu_tlbstate.loaded_mm);
- switch_mm_irqs_off(NULL, mm, current);
+ prev_mm = this_cpu_read(cpu_tlbstate.loaded_mm);
+ switch_mm_irqs_off(NULL, temp_mm, current);
/*
* If breakpoints are enabled, disable them while the temporary mm is
@@ -2187,17 +2183,17 @@ static inline temp_mm_state_t use_temporary_mm(struct mm_struct *mm)
if (hw_breakpoint_active())
hw_breakpoint_disable();
- return temp_state;
+ return prev_mm;
}
__ro_after_init struct mm_struct *text_poke_mm;
__ro_after_init unsigned long text_poke_addr;
-static inline void unuse_temporary_mm(temp_mm_state_t prev_state)
+static inline void unuse_temporary_mm(struct mm_struct *prev_mm)
{
lockdep_assert_irqs_disabled();
- switch_mm_irqs_off(NULL, prev_state.mm, current);
+ switch_mm_irqs_off(NULL, prev_mm, current);
/* Clear the cpumask, to indicate no TLB flushing is needed anywhere */
cpumask_clear_cpu(raw_smp_processor_id(), mm_cpumask(text_poke_mm));
@@ -2228,7 +2224,7 @@ static void *__text_poke(text_poke_f func, void *addr, const void *src, size_t l
{
bool cross_page_boundary = offset_in_page(addr) + len > PAGE_SIZE;
struct page *pages[2] = {NULL};
- temp_mm_state_t prev;
+ struct mm_struct *prev_mm;
unsigned long flags;
pte_t pte, *ptep;
spinlock_t *ptl;
@@ -2286,7 +2282,7 @@ static void *__text_poke(text_poke_f func, void *addr, const void *src, size_t l
* Loading the temporary mm behaves as a compiler barrier, which
* guarantees that the PTE will be set at the time memcpy() is done.
*/
- prev = use_temporary_mm(text_poke_mm);
+ prev_mm = use_temporary_mm(text_poke_mm);
kasan_disable_current();
func((u8 *)text_poke_addr + offset_in_page(addr), src, len);
@@ -2307,7 +2303,7 @@ static void *__text_poke(text_poke_f func, void *addr, const void *src, size_t l
* instruction that already allows the core to see the updated version.
* Xen-PV is assumed to serialize execution in a similar manner.
*/
- unuse_temporary_mm(prev);
+ unuse_temporary_mm(prev_mm);
/*
* Flushing the TLB might involve IPIs, which would require enabled
--
2.45.2
* [PATCH 13/41] x86/alternatives: Rename 'text_poke_flush()' to 'text_poke_int3_flush()'
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
` (11 preceding siblings ...)
2025-03-27 20:53 ` [PATCH 12/41] x86/alternatives: Remove the confusing, inaccurate & unnecessary 'temp_mm_state_t' abstraction Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 14/41] x86/alternatives: Rename 'text_poke_finish()' to 'text_poke_int3_finish()' Ingo Molnar
` (28 subsequent siblings)
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
This name is actively confusing, because the simple text_poke*()
APIs use MM-switching based code patching, while text_poke_flush()
is part of the INT3 based text_poke_int3_*() machinery that is an
additional layer of functionality on top of regular text_poke*() functionality.
Rename it to text_poke_int3_flush() to make it clear which layer
it belongs to.
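For readers unfamiliar with the INT3 layer, the queue/flush/finish interplay can be modelled in user space roughly as below. This is a simplified sketch, not the kernel code: the model_*() names and the 'nr_batches' counter are hypothetical, the vector holds bare addresses instead of patch descriptors, TP_VEC_MAX is shrunk to 4, and model_batch() merely counts where the real code would run the INT3 batch-patching sequence.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TP_VEC_MAX 4	/* tiny on purpose; the kernel derives it from PAGE_SIZE */

static uintptr_t tp_vec[TP_VEC_MAX];	/* queued patch addresses */
static int tp_vec_nr;
static int nr_batches;			/* hypothetical counter, illustration only */

/* Stand-in for text_poke_int3_batch(): only counts processed batches. */
static void model_batch(void)
{
	nr_batches++;
}

/* The vector must stay sorted; addr == 0 forces a flush of pending entries. */
static int model_order_fail(uintptr_t addr)
{
	if (!tp_vec_nr)
		return 0;
	if (!addr)
		return 1;
	return tp_vec[tp_vec_nr - 1] > addr;
}

/* Process the pending batch if it is full or addr would break the order. */
static void model_flush(uintptr_t addr)
{
	if (tp_vec_nr == TP_VEC_MAX || model_order_fail(addr)) {
		model_batch();
		tp_vec_nr = 0;
	}
}

/* Queue one address, flushing first when required. */
static void model_queue(uintptr_t addr)
{
	model_flush(addr);
	tp_vec[tp_vec_nr++] = addr;
}

/* Process whatever is still pending. */
static void model_finish(void)
{
	model_flush(0);
}
```

In this model, flush is the internal helper and finish is the public entry point that drains the queue, which mirrors how the static text_poke_int3_flush() is called from the exported finish and queue functions in the diff below.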
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/kernel/alternative.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 557ee2546177..bf8080a68f66 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2858,7 +2858,7 @@ static bool tp_order_fail(void *addr)
return false;
}
-static void text_poke_flush(void *addr)
+static void text_poke_int3_flush(void *addr)
{
if (tp_vec_nr == TP_VEC_MAX || tp_order_fail(addr)) {
text_poke_int3_batch(tp_vec, tp_vec_nr);
@@ -2868,14 +2868,14 @@ static void text_poke_flush(void *addr)
void text_poke_finish(void)
{
- text_poke_flush(NULL);
+ text_poke_int3_flush(NULL);
}
void __ref text_poke_queue(void *addr, const void *opcode, size_t len, const void *emulate)
{
struct text_poke_loc *tp;
- text_poke_flush(addr);
+ text_poke_int3_flush(addr);
tp = &tp_vec[tp_vec_nr++];
text_poke_loc_init(tp, addr, opcode, len, emulate);
--
2.45.2
* [PATCH 14/41] x86/alternatives: Rename 'text_poke_finish()' to 'text_poke_int3_finish()'
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
` (12 preceding siblings ...)
2025-03-27 20:53 ` [PATCH 13/41] x86/alternatives: Rename 'text_poke_flush()' to 'text_poke_int3_flush()' Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 15/41] x86/alternatives: Rename 'text_poke_queue()' to 'text_poke_int3_queue()' Ingo Molnar
` (27 subsequent siblings)
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
This name is actively confusing as well, because the simple text_poke*()
APIs use MM-switching based code patching, while text_poke_finish()
is part of the INT3 based text_poke_int3_*() machinery that is an
additional layer of functionality on top of regular text_poke*() functionality.
Rename it to text_poke_int3_finish() to make it clear which layer
it belongs to.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/include/asm/text-patching.h | 2 +-
arch/x86/kernel/alternative.c | 2 +-
arch/x86/kernel/ftrace.c | 4 ++--
arch/x86/kernel/jump_label.c | 2 +-
4 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/arch/x86/include/asm/text-patching.h b/arch/x86/include/asm/text-patching.h
index c2dbb0e4d80d..43c5f3aecf02 100644
--- a/arch/x86/include/asm/text-patching.h
+++ b/arch/x86/include/asm/text-patching.h
@@ -42,7 +42,7 @@ extern int text_poke_int3_handler(struct pt_regs *regs);
extern void text_poke_int3(void *addr, const void *opcode, size_t len, const void *emulate);
extern void text_poke_queue(void *addr, const void *opcode, size_t len, const void *emulate);
-extern void text_poke_finish(void);
+extern void text_poke_int3_finish(void);
#define INT3_INSN_SIZE 1
#define INT3_INSN_OPCODE 0xCC
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index bf8080a68f66..cc86c1399a7b 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2866,7 +2866,7 @@ static void text_poke_int3_flush(void *addr)
}
}
-void text_poke_finish(void)
+void text_poke_int3_finish(void)
{
text_poke_int3_flush(NULL);
}
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 4e284ff674f1..7ab5657078a4 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -58,7 +58,7 @@ void ftrace_arch_code_modify_post_process(void)
* module load, and we need to finish the text_poke_queue()
* that they do, here.
*/
- text_poke_finish();
+ text_poke_int3_finish();
ftrace_poke_late = 0;
mutex_unlock(&text_mutex);
}
@@ -250,7 +250,7 @@ void ftrace_replace_code(int enable)
text_poke_queue((void *)rec->ip, new, MCOUNT_INSN_SIZE, NULL);
ftrace_update_record(rec, enable);
}
- text_poke_finish();
+ text_poke_int3_finish();
}
void arch_ftrace_update_code(int command)
diff --git a/arch/x86/kernel/jump_label.c b/arch/x86/kernel/jump_label.c
index 94e2dcc94d9d..5a1adf229fcf 100644
--- a/arch/x86/kernel/jump_label.c
+++ b/arch/x86/kernel/jump_label.c
@@ -143,6 +143,6 @@ bool arch_jump_label_transform_queue(struct jump_entry *entry,
void arch_jump_label_transform_apply(void)
{
mutex_lock(&text_mutex);
- text_poke_finish();
+ text_poke_int3_finish();
mutex_unlock(&text_mutex);
}
--
2.45.2
* [PATCH 15/41] x86/alternatives: Rename 'text_poke_queue()' to 'text_poke_int3_queue()'
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
` (13 preceding siblings ...)
2025-03-27 20:53 ` [PATCH 14/41] x86/alternatives: Rename 'text_poke_finish()' to 'text_poke_int3_finish()' Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 16/41] x86/alternatives: Rename 'text_poke_loc_init()' to 'text_poke_int3_loc_init()' Ingo Molnar
` (26 subsequent siblings)
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
This name is actively confusing as well, because the simple text_poke*()
APIs use MM-switching based code patching, while text_poke_queue()
is part of the INT3 based text_poke_int3_*() machinery that is an
additional layer of functionality on top of regular text_poke*() functionality.
Rename it to text_poke_int3_queue() to make it clear which layer
it belongs to.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/include/asm/text-patching.h | 2 +-
arch/x86/kernel/alternative.c | 2 +-
arch/x86/kernel/ftrace.c | 6 +++---
arch/x86/kernel/jump_label.c | 2 +-
4 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/arch/x86/include/asm/text-patching.h b/arch/x86/include/asm/text-patching.h
index 43c5f3aecf02..7deb06aec467 100644
--- a/arch/x86/include/asm/text-patching.h
+++ b/arch/x86/include/asm/text-patching.h
@@ -41,7 +41,7 @@ extern void *text_poke_set(void *addr, int c, size_t len);
extern int text_poke_int3_handler(struct pt_regs *regs);
extern void text_poke_int3(void *addr, const void *opcode, size_t len, const void *emulate);
-extern void text_poke_queue(void *addr, const void *opcode, size_t len, const void *emulate);
+extern void text_poke_int3_queue(void *addr, const void *opcode, size_t len, const void *emulate);
extern void text_poke_int3_finish(void);
#define INT3_INSN_SIZE 1
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index cc86c1399a7b..89ab3a11f26e 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2871,7 +2871,7 @@ void text_poke_int3_finish(void)
text_poke_int3_flush(NULL);
}
-void __ref text_poke_queue(void *addr, const void *opcode, size_t len, const void *emulate)
+void __ref text_poke_int3_queue(void *addr, const void *opcode, size_t len, const void *emulate)
{
struct text_poke_loc *tp;
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 7ab5657078a4..ff3cdd08f28f 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -55,7 +55,7 @@ void ftrace_arch_code_modify_post_process(void)
{
/*
* ftrace_make_{call,nop}() may be called during
- * module load, and we need to finish the text_poke_queue()
+ * module load, and we need to finish the text_poke_int3_queue()
* that they do, here.
*/
text_poke_int3_finish();
@@ -119,7 +119,7 @@ ftrace_modify_code_direct(unsigned long ip, const char *old_code,
/* replace the text with the new text */
if (ftrace_poke_late)
- text_poke_queue((void *)ip, new_code, MCOUNT_INSN_SIZE, NULL);
+ text_poke_int3_queue((void *)ip, new_code, MCOUNT_INSN_SIZE, NULL);
else
text_poke_early((void *)ip, new_code, MCOUNT_INSN_SIZE);
return 0;
@@ -247,7 +247,7 @@ void ftrace_replace_code(int enable)
break;
}
- text_poke_queue((void *)rec->ip, new, MCOUNT_INSN_SIZE, NULL);
+ text_poke_int3_queue((void *)rec->ip, new, MCOUNT_INSN_SIZE, NULL);
ftrace_update_record(rec, enable);
}
text_poke_int3_finish();
diff --git a/arch/x86/kernel/jump_label.c b/arch/x86/kernel/jump_label.c
index 5a1adf229fcf..f72738e6d7d4 100644
--- a/arch/x86/kernel/jump_label.c
+++ b/arch/x86/kernel/jump_label.c
@@ -135,7 +135,7 @@ bool arch_jump_label_transform_queue(struct jump_entry *entry,
mutex_lock(&text_mutex);
jlp = __jump_label_patch(entry, type);
- text_poke_queue((void *)jump_entry_code(entry), jlp.code, jlp.size, NULL);
+ text_poke_int3_queue((void *)jump_entry_code(entry), jlp.code, jlp.size, NULL);
mutex_unlock(&text_mutex);
return true;
}
--
2.45.2
* [PATCH 16/41] x86/alternatives: Rename 'text_poke_loc_init()' to 'text_poke_int3_loc_init()'
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
` (14 preceding siblings ...)
2025-03-27 20:53 ` [PATCH 15/41] x86/alternatives: Rename 'text_poke_queue()' to 'text_poke_int3_queue()' Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 17/41] x86/alternatives: Rename 'struct text_poke_loc' to 'struct text_poke_int3_loc' Ingo Molnar
` (25 subsequent siblings)
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
This name is actively confusing as well, because the simple text_poke*()
APIs use MM-switching based code patching, while text_poke_loc_init()
is part of the INT3 based text_poke_int3_*() machinery that is an
additional layer of functionality on top of regular text_poke*() functionality.
Rename it to text_poke_int3_loc_init() to make it clear which layer
it belongs to.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/kernel/alternative.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 89ab3a11f26e..64355aa25402 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2762,7 +2762,7 @@ static void text_poke_int3_batch(struct text_poke_loc *tp, unsigned int nr_entri
}
}
-static void text_poke_loc_init(struct text_poke_loc *tp, void *addr,
+static void text_poke_int3_loc_init(struct text_poke_loc *tp, void *addr,
const void *opcode, size_t len, const void *emulate)
{
struct insn insn;
@@ -2878,7 +2878,7 @@ void __ref text_poke_int3_queue(void *addr, const void *opcode, size_t len, cons
text_poke_int3_flush(addr);
tp = &tp_vec[tp_vec_nr++];
- text_poke_loc_init(tp, addr, opcode, len, emulate);
+ text_poke_int3_loc_init(tp, addr, opcode, len, emulate);
}
/**
@@ -2896,6 +2896,6 @@ void __ref text_poke_int3(void *addr, const void *opcode, size_t len, const void
{
struct text_poke_loc tp;
- text_poke_loc_init(&tp, addr, opcode, len, emulate);
+ text_poke_int3_loc_init(&tp, addr, opcode, len, emulate);
text_poke_int3_batch(&tp, 1);
}
--
2.45.2
* [PATCH 17/41] x86/alternatives: Rename 'struct text_poke_loc' to 'struct text_poke_int3_loc'
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
` (15 preceding siblings ...)
2025-03-27 20:53 ` [PATCH 16/41] x86/alternatives: Rename 'text_poke_loc_init()' to 'text_poke_int3_loc_init()' Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 18/41] x86/alternatives: Rename 'struct int3_patching_desc' to 'struct text_poke_int3_vec' Ingo Molnar
` (24 subsequent siblings)
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
Make it clear that this structure is part of the INT3 based
patching facility, not the regular text_poke*() MM-switch
based facility.
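As an illustration of how this structure is used by the INT3 layer, the address computation and sorted-vector lookup can be sketched in user space as below. The assumptions are loud ones: 'text_base' is a plain byte array standing in for the kernel's _stext, the structure is reduced to its rel_addr field, lookup_loc() is a hypothetical helper, and the C library's bsearch() replaces the kernel's __inline_bsearch().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for the kernel's _stext symbol (assumption: a plain byte array). */
static char text_base[256];

/* Reduced model: only the relative-address field of the real structure. */
struct text_poke_int3_loc {
	int32_t rel_addr;	/* addr := text_base + rel_addr */
};

/* Turn the stored relative offset back into an absolute address. */
static void *text_poke_int3_addr(const struct text_poke_int3_loc *tp)
{
	return text_base + tp->rel_addr;
}

/* bsearch() comparator: key is an instruction address, elt a vector entry. */
static int patch_cmp(const void *key, const void *elt)
{
	const struct text_poke_int3_loc *tp = elt;

	if ((uintptr_t)key < (uintptr_t)text_poke_int3_addr(tp))
		return -1;
	if ((uintptr_t)key > (uintptr_t)text_poke_int3_addr(tp))
		return 1;
	return 0;
}

/* Hypothetical helper: find the entry for ip in a vector sorted by address. */
static struct text_poke_int3_loc *
lookup_loc(void *ip, struct text_poke_int3_loc *vec, size_t nr)
{
	return bsearch(ip, vec, nr, sizeof(*vec), patch_cmp);
}
```

The lookup only works because entries are kept in ascending address order, which is what the ordering check in the queueing path is there to guarantee.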
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/kernel/alternative.c | 26 +++++++++++++-------------
1 file changed, 13 insertions(+), 13 deletions(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 64355aa25402..62aead1bd671 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2455,7 +2455,7 @@ void text_poke_sync(void)
* this thing. When len == 6 everything is prefixed with 0x0f and we map
* opcode to Jcc.d8, using len to distinguish.
*/
-struct text_poke_loc {
+struct text_poke_int3_loc {
/* addr := _stext + rel_addr */
s32 rel_addr;
s32 disp;
@@ -2467,7 +2467,7 @@ struct text_poke_loc {
};
struct int3_patching_desc {
- struct text_poke_loc *vec;
+ struct text_poke_int3_loc *vec;
int nr_entries;
};
@@ -2494,14 +2494,14 @@ static __always_inline void put_desc(void)
raw_atomic_dec(refs);
}
-static __always_inline void *text_poke_int3_addr(struct text_poke_loc *tp)
+static __always_inline void *text_poke_int3_addr(struct text_poke_int3_loc *tp)
{
return _stext + tp->rel_addr;
}
static __always_inline int patch_cmp(const void *key, const void *elt)
{
- struct text_poke_loc *tp = (struct text_poke_loc *) elt;
+ struct text_poke_int3_loc *tp = (struct text_poke_int3_loc *) elt;
if (key < text_poke_int3_addr(tp))
return -1;
@@ -2513,7 +2513,7 @@ static __always_inline int patch_cmp(const void *key, const void *elt)
noinstr int text_poke_int3_handler(struct pt_regs *regs)
{
struct int3_patching_desc *desc;
- struct text_poke_loc *tp;
+ struct text_poke_int3_loc *tp;
int ret = 0;
void *ip;
@@ -2544,7 +2544,7 @@ noinstr int text_poke_int3_handler(struct pt_regs *regs)
*/
if (unlikely(desc->nr_entries > 1)) {
tp = __inline_bsearch(ip, desc->vec, desc->nr_entries,
- sizeof(struct text_poke_loc),
+ sizeof(struct text_poke_int3_loc),
patch_cmp);
if (!tp)
goto out_put;
@@ -2592,8 +2592,8 @@ noinstr int text_poke_int3_handler(struct pt_regs *regs)
return ret;
}
-#define TP_VEC_MAX (PAGE_SIZE / sizeof(struct text_poke_loc))
-static struct text_poke_loc tp_vec[TP_VEC_MAX];
+#define TP_VEC_MAX (PAGE_SIZE / sizeof(struct text_poke_int3_loc))
+static struct text_poke_int3_loc tp_vec[TP_VEC_MAX];
static int tp_vec_nr;
/**
@@ -2617,7 +2617,7 @@ static int tp_vec_nr;
* replacing opcode
* - sync cores
*/
-static void text_poke_int3_batch(struct text_poke_loc *tp, unsigned int nr_entries)
+static void text_poke_int3_batch(struct text_poke_int3_loc *tp, unsigned int nr_entries)
{
unsigned char int3 = INT3_INSN_OPCODE;
unsigned int i;
@@ -2762,7 +2762,7 @@ static void text_poke_int3_batch(struct text_poke_loc *tp, unsigned int nr_entri
}
}
-static void text_poke_int3_loc_init(struct text_poke_loc *tp, void *addr,
+static void text_poke_int3_loc_init(struct text_poke_int3_loc *tp, void *addr,
const void *opcode, size_t len, const void *emulate)
{
struct insn insn;
@@ -2843,7 +2843,7 @@ static void text_poke_int3_loc_init(struct text_poke_loc *tp, void *addr,
*/
static bool tp_order_fail(void *addr)
{
- struct text_poke_loc *tp;
+ struct text_poke_int3_loc *tp;
if (!tp_vec_nr)
return false;
@@ -2873,7 +2873,7 @@ void text_poke_int3_finish(void)
void __ref text_poke_int3_queue(void *addr, const void *opcode, size_t len, const void *emulate)
{
- struct text_poke_loc *tp;
+ struct text_poke_int3_loc *tp;
text_poke_int3_flush(addr);
@@ -2894,7 +2894,7 @@ void __ref text_poke_int3_queue(void *addr, const void *opcode, size_t len, cons
*/
void __ref text_poke_int3(void *addr, const void *opcode, size_t len, const void *emulate)
{
- struct text_poke_loc tp;
+ struct text_poke_int3_loc tp;
text_poke_int3_loc_init(&tp, addr, opcode, len, emulate);
text_poke_int3_batch(&tp, 1);
--
2.45.2
* [PATCH 18/41] x86/alternatives: Rename 'struct int3_patching_desc' to 'struct text_poke_int3_vec'
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
` (16 preceding siblings ...)
2025-03-27 20:53 ` [PATCH 17/41] x86/alternatives: Rename 'struct text_poke_loc' to 'struct text_poke_int3_loc' Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 19/41] x86/alternatives: Rename 'int3_desc' to 'int3_vec' Ingo Molnar
` (23 subsequent siblings)
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
Follow the INT3 text-poking nomenclature, and also adopt the
'vector' name for the entire object, instead of the rather
opaque 'descriptor' naming.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/kernel/alternative.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 62aead1bd671..84c26d037f05 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2466,17 +2466,17 @@ struct text_poke_int3_loc {
u8 old;
};
-struct int3_patching_desc {
+struct text_poke_int3_vec {
struct text_poke_int3_loc *vec;
int nr_entries;
};
static DEFINE_PER_CPU(atomic_t, int3_refs);
-static struct int3_patching_desc int3_desc;
+static struct text_poke_int3_vec int3_desc;
static __always_inline
-struct int3_patching_desc *try_get_desc(void)
+struct text_poke_int3_vec *try_get_desc(void)
{
atomic_t *refs = this_cpu_ptr(&int3_refs);
@@ -2512,7 +2512,7 @@ static __always_inline int patch_cmp(const void *key, const void *elt)
noinstr int text_poke_int3_handler(struct pt_regs *regs)
{
- struct int3_patching_desc *desc;
+ struct text_poke_int3_vec *desc;
struct text_poke_int3_loc *tp;
int ret = 0;
void *ip;
--
2.45.2
* [PATCH 19/41] x86/alternatives: Rename 'int3_desc' to 'int3_vec'
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
` (17 preceding siblings ...)
2025-03-27 20:53 ` [PATCH 18/41] x86/alternatives: Rename 'struct int3_patching_desc' to 'struct text_poke_int3_vec' Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 20/41] x86/alternatives: Add text_mutex assert to text_poke_int3_flush() Ingo Molnar
` (22 subsequent siblings)
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/kernel/alternative.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 84c26d037f05..a10e1b9db7b4 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2473,7 +2473,7 @@ struct text_poke_int3_vec {
static DEFINE_PER_CPU(atomic_t, int3_refs);
-static struct text_poke_int3_vec int3_desc;
+static struct text_poke_int3_vec int3_vec;
static __always_inline
struct text_poke_int3_vec *try_get_desc(void)
@@ -2483,7 +2483,7 @@ struct text_poke_int3_vec *try_get_desc(void)
if (!raw_atomic_inc_not_zero(refs))
return NULL;
- return &int3_desc;
+ return &int3_vec;
}
static __always_inline void put_desc(void)
@@ -2522,7 +2522,7 @@ noinstr int text_poke_int3_handler(struct pt_regs *regs)
/*
* Having observed our INT3 instruction, we now must observe
- * int3_desc with non-zero refcount:
+ * int3_vec with non-zero refcount:
*
* int3_refs = 1 INT3
* WMB RMB
@@ -2625,12 +2625,12 @@ static void text_poke_int3_batch(struct text_poke_int3_loc *tp, unsigned int nr_
lockdep_assert_held(&text_mutex);
- int3_desc.vec = tp;
- int3_desc.nr_entries = nr_entries;
+ int3_vec.vec = tp;
+ int3_vec.nr_entries = nr_entries;
/*
* Corresponds to the implicit memory barrier in try_get_desc() to
- * ensure reading a non-zero refcount provides up to date int3_desc data.
+ * ensure reading a non-zero refcount provides up to date int3_vec data.
*/
for_each_possible_cpu(i)
atomic_set_release(per_cpu_ptr(&int3_refs, i), 1);
--
2.45.2
* [PATCH 20/41] x86/alternatives: Add text_mutex assert to text_poke_int3_flush()
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
` (18 preceding siblings ...)
2025-03-27 20:53 ` [PATCH 19/41] x86/alternatives: Rename 'int3_desc' to 'int3_vec' Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 21/41] x86/alternatives: Assert that text_poke_int3_handler() can only ever handle 'tp_vec[]' based requests Ingo Molnar
` (21 subsequent siblings)
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
It's possible to escape the text_mutex-held assert in
text_poke_int3_batch() if the caller uses a properly
batched and sorted series of patch requests, so add
an explicit lockdep_assert_held() to make sure it's
held by all callers.
All text_poke_int3_*() APIs will call either text_poke_int3_batch()
or text_poke_int3_flush() internally.
The text_mutex must be held, because tp_vec and tp_vec_nr et al
are all globals, and the INT3 patching machinery itself relies on
external serialization.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/kernel/alternative.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index a10e1b9db7b4..f75806d699be 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2860,6 +2860,8 @@ static bool tp_order_fail(void *addr)
static void text_poke_int3_flush(void *addr)
{
+ lockdep_assert_held(&text_mutex);
+
if (tp_vec_nr == TP_VEC_MAX || tp_order_fail(addr)) {
text_poke_int3_batch(tp_vec, tp_vec_nr);
tp_vec_nr = 0;
--
2.45.2
* [PATCH 21/41] x86/alternatives: Assert that text_poke_int3_handler() can only ever handle 'tp_vec[]' based requests
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
` (19 preceding siblings ...)
2025-03-27 20:53 ` [PATCH 20/41] x86/alternatives: Add text_mutex assert to text_poke_int3_flush() Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 22/41] x86/alternatives: Use non-inverted logic instead of 'tp_order_fail()' Ingo Molnar
` (20 subsequent siblings)
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/kernel/alternative.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index f75806d699be..883c2146ce54 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2510,6 +2510,10 @@ static __always_inline int patch_cmp(const void *key, const void *elt)
return 0;
}
+#define TP_VEC_MAX (PAGE_SIZE / sizeof(struct text_poke_int3_loc))
+static struct text_poke_int3_loc tp_vec[TP_VEC_MAX];
+static int tp_vec_nr;
+
noinstr int text_poke_int3_handler(struct pt_regs *regs)
{
struct text_poke_int3_vec *desc;
@@ -2534,6 +2538,8 @@ noinstr int text_poke_int3_handler(struct pt_regs *regs)
if (!desc)
return 0;
+ WARN_ON_ONCE(desc->vec != tp_vec);
+
/*
* Discount the INT3. See text_poke_int3_batch().
*/
@@ -2592,10 +2598,6 @@ noinstr int text_poke_int3_handler(struct pt_regs *regs)
return ret;
}
-#define TP_VEC_MAX (PAGE_SIZE / sizeof(struct text_poke_int3_loc))
-static struct text_poke_int3_loc tp_vec[TP_VEC_MAX];
-static int tp_vec_nr;
-
/**
* text_poke_int3_batch() -- update instructions on live kernel on SMP
* @tp: vector of instructions to patch
--
2.45.2
* [PATCH 22/41] x86/alternatives: Use non-inverted logic instead of 'tp_order_fail()'
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
` (20 preceding siblings ...)
2025-03-27 20:53 ` [PATCH 21/41] x86/alternatives: Assert that text_poke_int3_handler() can only ever handle 'tp_vec[]' based requests Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 23/41] x86/alternatives: Remove the 'addr == NULL means forced-flush' hack from text_poke_int3_finish()/text_poke_int3_flush()/tp_addr_ordered() Ingo Molnar
` (19 subsequent siblings)
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
tp_order_fail() uses inverted logic: it returns true in case something
is false, which is only a plus at the IOCCC.
Instead rename it to 'tp_addr_ordered()', which uses regular
(non-inverted) logic, and adjust the code accordingly.
Also add a comment explaining how the address ordering should be
understood.
No change in functionality intended.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
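For readers who want to try the ordering rule outside the kernel, here is a
minimal userspace sketch of the non-inverted check. The names sk_vec,
sk_vec_nr, sk_addr_ordered(), sk_flush_batch() and sk_queue_addr() are
hypothetical stand-ins and not kernel APIs; addresses are plain integers, and
the 'addr == NULL means force' case that still exists at this point in the
series is deliberately left out:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for tp_vec[] / tp_vec_nr (assumption). */
#define SK_VEC_MAX 4

static uintptr_t sk_vec[SK_VEC_MAX];
static size_t sk_vec_nr;
static unsigned int sk_flushes;

/* Returns true if appending 'addr' keeps the vector sorted. */
static bool sk_addr_ordered(uintptr_t addr)
{
	if (sk_vec_nr == 0)
		return true;	/* an empty vector is trivially ordered */

	/* The last queued address must not be higher than the new one. */
	return sk_vec[sk_vec_nr - 1] <= addr;
}

/* Stand-in for the batch flush: count it and empty the vector. */
static void sk_flush_batch(void)
{
	sk_flushes++;
	sk_vec_nr = 0;
}

/* Queue one address, flushing first if the vector is full or unordered. */
static void sk_queue_addr(uintptr_t addr)
{
	if (sk_vec_nr == SK_VEC_MAX || !sk_addr_ordered(addr))
		sk_flush_batch();

	sk_vec[sk_vec_nr++] = addr;
}
```

With the predicate named after the condition that holds on success, the
caller reads as "flush if full or not ordered", with no double negation.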
arch/x86/kernel/alternative.c | 20 +++++++++++++-------
1 file changed, 13 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 883c2146ce54..938e8e70a379 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2843,28 +2843,34 @@ static void text_poke_int3_loc_init(struct text_poke_int3_loc *tp, void *addr,
* We hard rely on the tp_vec being ordered; ensure this is so by flushing
* early if needed.
*/
-static bool tp_order_fail(void *addr)
+static bool tp_addr_ordered(void *addr)
{
struct text_poke_int3_loc *tp;
if (!tp_vec_nr)
- return false;
+ return true;
if (!addr) /* force */
- return true;
+ return false;
- tp = &tp_vec[tp_vec_nr - 1];
+ /*
+ * If the last current entry's address is higher than the
+ * new entry's address we'd like to add, then ordering
+ * is violated and we must first flush all pending patching
+ * requests:
+ */
+ tp = &tp_vec[tp_vec_nr-1];
if ((unsigned long)text_poke_int3_addr(tp) > (unsigned long)addr)
- return true;
+ return false;
- return false;
+ return true;
}
static void text_poke_int3_flush(void *addr)
{
lockdep_assert_held(&text_mutex);
- if (tp_vec_nr == TP_VEC_MAX || tp_order_fail(addr)) {
+ if (tp_vec_nr == TP_VEC_MAX || !tp_addr_ordered(addr)) {
text_poke_int3_batch(tp_vec, tp_vec_nr);
tp_vec_nr = 0;
}
--
2.45.2
* [PATCH 23/41] x86/alternatives: Remove the 'addr == NULL means forced-flush' hack from text_poke_int3_finish()/text_poke_int3_flush()/tp_addr_ordered()
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
` (21 preceding siblings ...)
2025-03-27 20:53 ` [PATCH 22/41] x86/alternatives: Use non-inverted logic instead of 'tp_order_fail()' Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 24/41] x86/alternatives: Simplify text_poke_int3() by using tp_vec and existing APIs Ingo Molnar
` (18 subsequent siblings)
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
There's this weird hack used by text_poke_int3_finish() to indicate
a 'forced flush':
text_poke_int3_flush(NULL);
Just open-code the vector-flush in a straightforward fashion:
text_poke_int3_batch(tp_vec, tp_vec_nr);
tp_vec_nr = 0;
And get rid of the !addr hack from tp_addr_ordered().
Leave a WARN_ON_ONCE(), just in case some external code learned
to rely on this behavior.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
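The shape of the change can be sketched in plain userspace C. This is only an
illustration under assumptions: sk_queue, sk_queue_nr, sk_process_batch() and
sk_queue_finish() are hypothetical names, and the batch step just counts
invocations instead of patching anything:

```c
#include <stddef.h>

/* Hypothetical stand-ins for the pending-patch vector (assumption). */
#define SK_QUEUE_MAX 8

static const void *sk_queue[SK_QUEUE_MAX];
static size_t sk_queue_nr;
static unsigned int sk_batches_processed;

/* Stand-in for the batch-patching step: only counts invocations. */
static void sk_process_batch(const void **entries, size_t nr)
{
	(void)entries;
	(void)nr;
	sk_batches_processed++;
}

/*
 * Unconditional flush of whatever is pending. There is no sentinel
 * argument such as a NULL address to request the flush.
 */
void sk_queue_finish(void)
{
	if (sk_queue_nr) {
		sk_process_batch(sk_queue, sk_queue_nr);
		sk_queue_nr = 0;
	}
}
```

The point of the sketch is that "flush everything now" becomes its own
function body rather than a special value passed through the ordering check.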
arch/x86/kernel/alternative.c | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 938e8e70a379..906fb45b9e65 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2847,12 +2847,11 @@ static bool tp_addr_ordered(void *addr)
{
struct text_poke_int3_loc *tp;
+ WARN_ON_ONCE(!addr);
+
if (!tp_vec_nr)
return true;
- if (!addr) /* force */
- return false;
-
/*
* If the last current entry's address is higher than the
* new entry's address we'd like to add, then ordering
@@ -2866,6 +2865,14 @@ static bool tp_addr_ordered(void *addr)
return true;
}
+void text_poke_int3_finish(void)
+{
+ if (tp_vec_nr) {
+ text_poke_int3_batch(tp_vec, tp_vec_nr);
+ tp_vec_nr = 0;
+ }
+}
+
static void text_poke_int3_flush(void *addr)
{
lockdep_assert_held(&text_mutex);
@@ -2876,11 +2883,6 @@ static void text_poke_int3_flush(void *addr)
}
}
-void text_poke_int3_finish(void)
-{
- text_poke_int3_flush(NULL);
-}
-
void __ref text_poke_int3_queue(void *addr, const void *opcode, size_t len, const void *emulate)
{
struct text_poke_int3_loc *tp;
--
2.45.2
* [PATCH 24/41] x86/alternatives: Simplify text_poke_int3() by using tp_vec and existing APIs
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
` (22 preceding siblings ...)
2025-03-27 20:53 ` [PATCH 23/41] x86/alternatives: Remove the 'addr == NULL means forced-flush' hack from text_poke_int3_finish()/text_poke_int3_flush()/tp_addr_ordered() Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 25/41] x86/alternatives: Assert input parameters in text_poke_int3_batch() Ingo Molnar
` (17 subsequent siblings)
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
Instead of constructing a vector on-stack, just use the already
available batch-patching vector - which should always be empty
at this point.
This will allow subsequent simplifications.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/kernel/alternative.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 906fb45b9e65..4f23f6b4d51d 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2906,8 +2906,13 @@ void __ref text_poke_int3_queue(void *addr, const void *opcode, size_t len, cons
*/
void __ref text_poke_int3(void *addr, const void *opcode, size_t len, const void *emulate)
{
- struct text_poke_int3_loc tp;
+ struct text_poke_int3_loc *tp;
+
+ /* Batch-patching should not be mixed with single-patching: */
+ WARN_ON_ONCE(tp_vec_nr != 0);
+
+ tp = &tp_vec[tp_vec_nr++];
+ text_poke_int3_loc_init(tp, addr, opcode, len, emulate);
- text_poke_int3_loc_init(&tp, addr, opcode, len, emulate);
- text_poke_int3_batch(&tp, 1);
+ text_poke_int3_finish();
}
--
2.45.2
* [PATCH 25/41] x86/alternatives: Assert input parameters in text_poke_int3_batch()
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
` (23 preceding siblings ...)
2025-03-27 20:53 ` [PATCH 24/41] x86/alternatives: Simplify text_poke_int3() by using tp_vec and existing APIs Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 26/41] x86/alternatives: Introduce 'struct text_poke_int3_array' and move tp_vec and tp_vec_nr to it Ingo Molnar
` (16 subsequent siblings)
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
At this point the 'tp' input parameter must always be the
global 'tp_vec' array, and 'nr_entries' must always be equal
to 'tp_vec_nr'.
Assert these conditions - which will allow the removal of
a layer of indirection between these values.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/kernel/alternative.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 4f23f6b4d51d..393d796e797d 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2627,6 +2627,9 @@ static void text_poke_int3_batch(struct text_poke_int3_loc *tp, unsigned int nr_
lockdep_assert_held(&text_mutex);
+ WARN_ON_ONCE(tp != tp_vec);
+ WARN_ON_ONCE(nr_entries != tp_vec_nr);
+
int3_vec.vec = tp;
int3_vec.nr_entries = nr_entries;
--
2.45.2
* [PATCH 26/41] x86/alternatives: Introduce 'struct text_poke_int3_array' and move tp_vec and tp_vec_nr to it
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
` (24 preceding siblings ...)
2025-03-27 20:53 ` [PATCH 25/41] x86/alternatives: Assert input parameters in text_poke_int3_batch() Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 27/41] x86/alternatives: Remove the tp_vec indirection Ingo Molnar
` (15 subsequent siblings)
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
struct text_poke_int3_array is a structure equivalent to these global variables:
static struct text_poke_int3_loc tp_vec[TP_VEC_MAX];
static int tp_vec_nr;
Note that we intentionally mirror much of the naming of
'struct text_poke_int3_vec', which will further highlight
the unnecessary layering going on in this code, and will
ease its removal.
No change in functionality.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
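As a rough userspace illustration of folding the two globals into one object,
consider the sketch below. Everything in it is an assumption made for the
example: the 4 KiB SK_PAGE_SIZE, the reduced 'struct sk_loc' field set, and
the helper sk_array_add(), which does not exist in the kernel:

```c
#include <stddef.h>
#include <stdint.h>

#define SK_PAGE_SIZE 4096	/* assumption: 4 KiB pages */

/* Simplified stand-in for 'struct text_poke_int3_loc' (assumption). */
struct sk_loc {
	int32_t rel_addr;
	int32_t disp;
	uint8_t len;
	uint8_t opcode;
	uint8_t old;
};

#define SK_ARRAY_NR_ENTRIES_MAX (SK_PAGE_SIZE / sizeof(struct sk_loc))

/*
 * Before: a separate 'vec[]' array and a 'vec_nr' counter.
 * After:  one object that carries both the count and the entries.
 */
static struct sk_array {
	int nr_entries;
	struct sk_loc vec[SK_ARRAY_NR_ENTRIES_MAX];
} sk_array;

/* Hypothetical helper: claim the next slot, or return NULL when full. */
static struct sk_loc *sk_array_add(int32_t rel_addr)
{
	struct sk_loc *loc;

	if ((size_t)sk_array.nr_entries >= SK_ARRAY_NR_ENTRIES_MAX)
		return NULL;

	loc = &sk_array.vec[sk_array.nr_entries++];
	loc->rel_addr = rel_addr;
	return loc;
}
```

Keeping the count next to the entries means a single symbol describes the
whole pending batch, which is what makes the later removal of the pointer
plus count descriptor straightforward.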
arch/x86/kernel/alternative.c | 43 +++++++++++++++++++++++--------------------
1 file changed, 23 insertions(+), 20 deletions(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 393d796e797d..cf3bcaa97957 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2467,14 +2467,21 @@ struct text_poke_int3_loc {
};
struct text_poke_int3_vec {
- struct text_poke_int3_loc *vec;
int nr_entries;
+ struct text_poke_int3_loc *vec;
};
static DEFINE_PER_CPU(atomic_t, int3_refs);
static struct text_poke_int3_vec int3_vec;
+#define TP_ARRAY_NR_ENTRIES_MAX (PAGE_SIZE / sizeof(struct text_poke_int3_loc))
+
+static struct text_poke_int3_array {
+ int nr_entries;
+ struct text_poke_int3_loc vec[TP_ARRAY_NR_ENTRIES_MAX];
+} tp_array;
+
static __always_inline
struct text_poke_int3_vec *try_get_desc(void)
{
@@ -2510,10 +2517,6 @@ static __always_inline int patch_cmp(const void *key, const void *elt)
return 0;
}
-#define TP_VEC_MAX (PAGE_SIZE / sizeof(struct text_poke_int3_loc))
-static struct text_poke_int3_loc tp_vec[TP_VEC_MAX];
-static int tp_vec_nr;
-
noinstr int text_poke_int3_handler(struct pt_regs *regs)
{
struct text_poke_int3_vec *desc;
@@ -2538,7 +2541,7 @@ noinstr int text_poke_int3_handler(struct pt_regs *regs)
if (!desc)
return 0;
- WARN_ON_ONCE(desc->vec != tp_vec);
+ WARN_ON_ONCE(desc->vec != tp_array.vec);
/*
* Discount the INT3. See text_poke_int3_batch().
@@ -2627,8 +2630,8 @@ static void text_poke_int3_batch(struct text_poke_int3_loc *tp, unsigned int nr_
lockdep_assert_held(&text_mutex);
- WARN_ON_ONCE(tp != tp_vec);
- WARN_ON_ONCE(nr_entries != tp_vec_nr);
+ WARN_ON_ONCE(tp != tp_array.vec);
+ WARN_ON_ONCE(nr_entries != tp_array.nr_entries);
int3_vec.vec = tp;
int3_vec.nr_entries = nr_entries;
@@ -2843,7 +2846,7 @@ static void text_poke_int3_loc_init(struct text_poke_int3_loc *tp, void *addr,
}
/*
- * We hard rely on the tp_vec being ordered; ensure this is so by flushing
+ * We hard rely on the tp_array.vec being ordered; ensure this is so by flushing
* early if needed.
*/
static bool tp_addr_ordered(void *addr)
@@ -2852,7 +2855,7 @@ static bool tp_addr_ordered(void *addr)
WARN_ON_ONCE(!addr);
- if (!tp_vec_nr)
+ if (!tp_array.nr_entries)
return true;
/*
@@ -2861,7 +2864,7 @@ static bool tp_addr_ordered(void *addr)
* is violated and we must first flush all pending patching
* requests:
*/
- tp = &tp_vec[tp_vec_nr-1];
+ tp = &tp_array.vec[tp_array.nr_entries-1];
if ((unsigned long)text_poke_int3_addr(tp) > (unsigned long)addr)
return false;
@@ -2870,9 +2873,9 @@ static bool tp_addr_ordered(void *addr)
void text_poke_int3_finish(void)
{
- if (tp_vec_nr) {
- text_poke_int3_batch(tp_vec, tp_vec_nr);
- tp_vec_nr = 0;
+ if (tp_array.nr_entries) {
+ text_poke_int3_batch(tp_array.vec, tp_array.nr_entries);
+ tp_array.nr_entries = 0;
}
}
@@ -2880,9 +2883,9 @@ static void text_poke_int3_flush(void *addr)
{
lockdep_assert_held(&text_mutex);
- if (tp_vec_nr == TP_VEC_MAX || !tp_addr_ordered(addr)) {
- text_poke_int3_batch(tp_vec, tp_vec_nr);
- tp_vec_nr = 0;
+ if (tp_array.nr_entries == TP_ARRAY_NR_ENTRIES_MAX || !tp_addr_ordered(addr)) {
+ text_poke_int3_batch(tp_array.vec, tp_array.nr_entries);
+ tp_array.nr_entries = 0;
}
}
@@ -2892,7 +2895,7 @@ void __ref text_poke_int3_queue(void *addr, const void *opcode, size_t len, cons
text_poke_int3_flush(addr);
- tp = &tp_vec[tp_vec_nr++];
+ tp = &tp_array.vec[tp_array.nr_entries++];
text_poke_int3_loc_init(tp, addr, opcode, len, emulate);
}
@@ -2912,9 +2915,9 @@ void __ref text_poke_int3(void *addr, const void *opcode, size_t len, const void
struct text_poke_int3_loc *tp;
/* Batch-patching should not be mixed with single-patching: */
- WARN_ON_ONCE(tp_vec_nr != 0);
+ WARN_ON_ONCE(tp_array.nr_entries != 0);
- tp = &tp_vec[tp_vec_nr++];
+ tp = &tp_array.vec[tp_array.nr_entries++];
text_poke_int3_loc_init(tp, addr, opcode, len, emulate);
text_poke_int3_finish();
--
2.45.2
* [PATCH 27/41] x86/alternatives: Remove the tp_vec indirection
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
` (25 preceding siblings ...)
2025-03-27 20:53 ` [PATCH 26/41] x86/alternatives: Introduce 'struct text_poke_int3_array' and move tp_vec and tp_vec_nr to it Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 28/41] x86/alternatives: Rename 'try_get_desc()' to 'try_get_tp_array()' Ingo Molnar
` (14 subsequent siblings)
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
At this point we are always working out of an up-to-date
tp_array, so there's no need for text_poke_int3_handler()
to read via the int3_vec indirection. Remove it.
This simplifies the code:
1 file changed, 5 insertions(+), 15 deletions(-)
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/kernel/alternative.c | 24 +++++++-----------------
1 file changed, 7 insertions(+), 17 deletions(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index cf3bcaa97957..3baef1827f3c 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2466,15 +2466,6 @@ struct text_poke_int3_loc {
u8 old;
};
-struct text_poke_int3_vec {
- int nr_entries;
- struct text_poke_int3_loc *vec;
-};
-
-static DEFINE_PER_CPU(atomic_t, int3_refs);
-
-static struct text_poke_int3_vec int3_vec;
-
#define TP_ARRAY_NR_ENTRIES_MAX (PAGE_SIZE / sizeof(struct text_poke_int3_loc))
static struct text_poke_int3_array {
@@ -2482,15 +2473,17 @@ static struct text_poke_int3_array {
struct text_poke_int3_loc vec[TP_ARRAY_NR_ENTRIES_MAX];
} tp_array;
+static DEFINE_PER_CPU(atomic_t, int3_refs);
+
static __always_inline
-struct text_poke_int3_vec *try_get_desc(void)
+struct text_poke_int3_array *try_get_desc(void)
{
atomic_t *refs = this_cpu_ptr(&int3_refs);
if (!raw_atomic_inc_not_zero(refs))
return NULL;
- return &int3_vec;
+ return &tp_array;
}
static __always_inline void put_desc(void)
@@ -2519,7 +2512,7 @@ static __always_inline int patch_cmp(const void *key, const void *elt)
noinstr int text_poke_int3_handler(struct pt_regs *regs)
{
- struct text_poke_int3_vec *desc;
+ struct text_poke_int3_array *desc;
struct text_poke_int3_loc *tp;
int ret = 0;
void *ip;
@@ -2529,7 +2522,7 @@ noinstr int text_poke_int3_handler(struct pt_regs *regs)
/*
* Having observed our INT3 instruction, we now must observe
- * int3_vec with non-zero refcount:
+ * tp_array with non-zero refcount:
*
* int3_refs = 1 INT3
* WMB RMB
@@ -2633,12 +2626,9 @@ static void text_poke_int3_batch(struct text_poke_int3_loc *tp, unsigned int nr_
WARN_ON_ONCE(tp != tp_array.vec);
WARN_ON_ONCE(nr_entries != tp_array.nr_entries);
- int3_vec.vec = tp;
- int3_vec.nr_entries = nr_entries;
-
/*
* Corresponds to the implicit memory barrier in try_get_desc() to
- * ensure reading a non-zero refcount provides up to date int3_vec data.
+ * ensure reading a non-zero refcount provides up to date tp_array data.
*/
for_each_possible_cpu(i)
atomic_set_release(per_cpu_ptr(&int3_refs, i), 1);
--
2.45.2
* [PATCH 28/41] x86/alternatives: Rename 'try_get_desc()' to 'try_get_tp_array()'
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
` (26 preceding siblings ...)
2025-03-27 20:53 ` [PATCH 27/41] x86/alternatives: Remove the tp_vec indirection Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 29/41] x86/alternatives: Rename 'put_desc()' to 'put_tp_array()' Ingo Molnar
` (13 subsequent siblings)
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
This better reflects what the underlying code is doing:
there's no 'descriptor' indirection anymore.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/kernel/alternative.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 3baef1827f3c..4b5ab9002e07 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2476,7 +2476,7 @@ static struct text_poke_int3_array {
static DEFINE_PER_CPU(atomic_t, int3_refs);
static __always_inline
-struct text_poke_int3_array *try_get_desc(void)
+struct text_poke_int3_array *try_get_tp_array(void)
{
atomic_t *refs = this_cpu_ptr(&int3_refs);
@@ -2530,7 +2530,7 @@ noinstr int text_poke_int3_handler(struct pt_regs *regs)
*/
smp_rmb();
- desc = try_get_desc();
+ desc = try_get_tp_array();
if (!desc)
return 0;
@@ -2627,7 +2627,7 @@ static void text_poke_int3_batch(struct text_poke_int3_loc *tp, unsigned int nr_
WARN_ON_ONCE(nr_entries != tp_array.nr_entries);
/*
- * Corresponds to the implicit memory barrier in try_get_desc() to
+ * Corresponds to the implicit memory barrier in try_get_tp_array() to
* ensure reading a non-zero refcount provides up to date tp_array data.
*/
for_each_possible_cpu(i)
--
2.45.2
* [PATCH 29/41] x86/alternatives: Rename 'put_desc()' to 'put_tp_array()'
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
` (27 preceding siblings ...)
2025-03-27 20:53 ` [PATCH 28/41] x86/alternatives: Rename 'try_get_desc()' to 'try_get_tp_array()' Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 30/41] x86/alternatives: Simplify try_get_tp_array() Ingo Molnar
` (12 subsequent siblings)
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
Just like with try_get_tp_array(), this name better reflects
what the underlying code is doing: there's no 'descriptor'
indirection anymore.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/kernel/alternative.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 4b5ab9002e07..0b11f53d6e6d 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2486,7 +2486,7 @@ struct text_poke_int3_array *try_get_tp_array(void)
return &tp_array;
}
-static __always_inline void put_desc(void)
+static __always_inline void put_tp_array(void)
{
atomic_t *refs = this_cpu_ptr(&int3_refs);
@@ -2590,7 +2590,7 @@ noinstr int text_poke_int3_handler(struct pt_regs *regs)
ret = 1;
out_put:
- put_desc();
+ put_tp_array();
return ret;
}
--
2.45.2
* [PATCH 30/41] x86/alternatives: Simplify try_get_tp_array()
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
` (28 preceding siblings ...)
2025-03-27 20:53 ` [PATCH 29/41] x86/alternatives: Rename 'put_desc()' to 'put_tp_array()' Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 31/41] x86/alternatives: Simplify text_poke_int3_handler() Ingo Molnar
` (11 subsequent siblings)
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
There's no need to return a pointer on success - it's always
the same pointer.
Return a bool instead.
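
As an illustration only, the get/put pattern behind this change can be sketched in userspace with C11 atomics. The names refs, inc_not_zero(), try_get_array() and put_array() are hypothetical stand-ins for the kernel's per-CPU counter and helpers, and the sketch ignores the per-CPU and noinstr aspects entirely:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's per-CPU reference counter. */
static atomic_int refs;

/* Increment *v unless it is zero; same idea as atomic_inc_not_zero(). */
static bool inc_not_zero(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return true;
	}
	return false;
}

/* There is only one array, so success/failure is all the caller needs. */
static bool try_get_array(void)
{
	return inc_not_zero(&refs);
}

static void put_array(void)
{
	int prev = atomic_fetch_sub(&refs, 1);

	/* A put without a matching get would be a bug in the caller. */
	assert(prev > 0);
}
```

A caller would simply test the return value and then access the single global array directly, which is what the handler does after this patch.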
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/kernel/alternative.c | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 0b11f53d6e6d..244119066672 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2475,15 +2475,14 @@ static struct text_poke_int3_array {
static DEFINE_PER_CPU(atomic_t, int3_refs);
-static __always_inline
-struct text_poke_int3_array *try_get_tp_array(void)
+static bool try_get_tp_array(void)
{
atomic_t *refs = this_cpu_ptr(&int3_refs);
if (!raw_atomic_inc_not_zero(refs))
- return NULL;
+ return false;
- return &tp_array;
+ return true;
}
static __always_inline void put_tp_array(void)
@@ -2530,9 +2529,9 @@ noinstr int text_poke_int3_handler(struct pt_regs *regs)
*/
smp_rmb();
- desc = try_get_tp_array();
- if (!desc)
+ if (!try_get_tp_array())
return 0;
+ desc = &tp_array;
WARN_ON_ONCE(desc->vec != tp_array.vec);
--
2.45.2
* [PATCH 31/41] x86/alternatives: Simplify text_poke_int3_handler()
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
` (29 preceding siblings ...)
2025-03-27 20:53 ` [PATCH 30/41] x86/alternatives: Simplify try_get_tp_array() Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 32/41] x86/alternatives: Simplify text_poke_int3_batch() Ingo Molnar
` (10 subsequent siblings)
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
Remove the 'desc' local variable indirection and use
tp_array directly.
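
The lookup the handler performs on the global array can be sketched roughly as below. This is a userspace approximation under assumptions: struct loc, arr, loc_cmp() and lookup() are hypothetical names, libc bsearch() replaces __inline_bsearch(), and the key is passed by pointer rather than as the address itself. The extra nr_entries == 1 check is a defensive addition for the sketch; the kernel relies on the refcount to guarantee a non-empty array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical reduced entry: only the patched address is kept. */
struct loc {
	uintptr_t addr;
};

/* Single global array, kept sorted by address. */
static struct {
	struct loc vec[8];
	size_t nr_entries;
} arr;

static int loc_cmp(const void *key, const void *elt)
{
	uintptr_t k = *(const uintptr_t *)key;
	uintptr_t a = ((const struct loc *)elt)->addr;

	return (k > a) - (k < a);
}

/* Return the entry that patches 'ip', or NULL if there is none. */
static struct loc *lookup(uintptr_t ip)
{
	/* Only pay for the binary search with more than one entry. */
	if (arr.nr_entries > 1)
		return bsearch(&ip, arr.vec, arr.nr_entries,
			       sizeof(struct loc), loc_cmp);

	if (arr.nr_entries == 1 && arr.vec[0].addr == ip)
		return &arr.vec[0];

	return NULL;
}
```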
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/kernel/alternative.c | 10 +++-------
1 file changed, 3 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 244119066672..9402826e2903 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2511,7 +2511,6 @@ static __always_inline int patch_cmp(const void *key, const void *elt)
noinstr int text_poke_int3_handler(struct pt_regs *regs)
{
- struct text_poke_int3_array *desc;
struct text_poke_int3_loc *tp;
int ret = 0;
void *ip;
@@ -2531,9 +2530,6 @@ noinstr int text_poke_int3_handler(struct pt_regs *regs)
if (!try_get_tp_array())
return 0;
- desc = &tp_array;
-
- WARN_ON_ONCE(desc->vec != tp_array.vec);
/*
* Discount the INT3. See text_poke_int3_batch().
@@ -2543,14 +2539,14 @@ noinstr int text_poke_int3_handler(struct pt_regs *regs)
/*
* Skip the binary search if there is a single member in the vector.
*/
- if (unlikely(desc->nr_entries > 1)) {
- tp = __inline_bsearch(ip, desc->vec, desc->nr_entries,
+ if (unlikely(tp_array.nr_entries > 1)) {
+ tp = __inline_bsearch(ip, tp_array.vec, tp_array.nr_entries,
sizeof(struct text_poke_int3_loc),
patch_cmp);
if (!tp)
goto out_put;
} else {
- tp = desc->vec;
+ tp = tp_array.vec;
if (text_poke_int3_addr(tp) != ip)
goto out_put;
}
--
2.45.2
* [PATCH 32/41] x86/alternatives: Simplify text_poke_int3_batch()
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
` (30 preceding siblings ...)
2025-03-27 20:53 ` [PATCH 31/41] x86/alternatives: Simplify text_poke_int3_handler() Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 33/41] x86/alternatives: Rename 'text_poke_int3_batch()' to 'text_poke_int3_batch_process()' Ingo Molnar
` (9 subsequent siblings)
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
This function now uses the tp_array state exclusively;
make that explicit by removing the redundant input parameters.
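
For readers unfamiliar with the function, its three patching steps can be sketched on a plain byte buffer, driven purely by global state as in this patch. This is a simplified userspace model, not the kernel code: text_buf, struct loc, arr and batch_process() are hypothetical names, the cross-CPU synchronization between steps is only marked by comments, and the special handling of 6-byte instructions is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define INT3_OPCODE	0xCC
#define MAX_LEN		5
#define MAX_ENTRIES	4

struct loc {
	size_t	off;		/* offset into the simulated text buffer */
	uint8_t	len;		/* length of the new instruction */
	uint8_t	text[MAX_LEN];	/* new instruction bytes */
	uint8_t	old;		/* saved first byte of the old instruction */
};

/* Simulated kernel text. */
static uint8_t text_buf[32];

/* Global batch state; the function below takes no parameters. */
static struct {
	struct loc vec[MAX_ENTRIES];
	unsigned int nr_entries;
} arr;

static void batch_process(void)
{
	unsigned int i;

	/* Step 1: save the first byte and replace it with a breakpoint. */
	for (i = 0; i < arr.nr_entries; i++) {
		arr.vec[i].old = text_buf[arr.vec[i].off];
		text_buf[arr.vec[i].off] = INT3_OPCODE;
	}
	/* The kernel synchronizes all CPUs here. */

	/* Step 2: update all but the first byte of each instruction. */
	for (i = 0; i < arr.nr_entries; i++) {
		if (arr.vec[i].len > 1)
			memcpy(&text_buf[arr.vec[i].off + 1],
			       arr.vec[i].text + 1, arr.vec[i].len - 1);
	}
	/* The kernel synchronizes all CPUs here as well. */

	/* Step 3: replace the breakpoint with the new first byte. */
	for (i = 0; i < arr.nr_entries; i++)
		text_buf[arr.vec[i].off] = arr.vec[i].text[0];
}
```

At no point in this sequence does the buffer hold a mix of old and new bytes behind a valid first byte, which is the property the breakpoint is there to provide.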
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/kernel/alternative.c | 43 ++++++++++++++++++++-----------------------
1 file changed, 20 insertions(+), 23 deletions(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 9402826e2903..40e86b41bb86 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2591,8 +2591,8 @@ noinstr int text_poke_int3_handler(struct pt_regs *regs)
/**
* text_poke_int3_batch() -- update instructions on live kernel on SMP
- * @tp: vector of instructions to patch
- * @nr_entries: number of entries in the vector
+ * @tp_array.vec: vector of instructions to patch
+ * @tp_array.nr_entries: number of entries in the vector
*
* Modify multi-byte instruction by using int3 breakpoint on SMP.
* We completely avoid stop_machine() here, and achieve the
@@ -2610,7 +2610,7 @@ noinstr int text_poke_int3_handler(struct pt_regs *regs)
* replacing opcode
* - sync cores
*/
-static void text_poke_int3_batch(struct text_poke_int3_loc *tp, unsigned int nr_entries)
+static void text_poke_int3_batch(void)
{
unsigned char int3 = INT3_INSN_OPCODE;
unsigned int i;
@@ -2618,9 +2618,6 @@ static void text_poke_int3_batch(struct text_poke_int3_loc *tp, unsigned int nr_
lockdep_assert_held(&text_mutex);
- WARN_ON_ONCE(tp != tp_array.vec);
- WARN_ON_ONCE(nr_entries != tp_array.nr_entries);
-
/*
* Corresponds to the implicit memory barrier in try_get_tp_array() to
* ensure reading a non-zero refcount provides up to date tp_array data.
@@ -2640,16 +2637,16 @@ static void text_poke_int3_batch(struct text_poke_int3_loc *tp, unsigned int nr_
/*
* Corresponding read barrier in int3 notifier for making sure the
- * nr_entries and handler are correctly ordered wrt. patching.
+ * tp_array.nr_entries and handler are correctly ordered wrt. patching.
*/
smp_wmb();
/*
* First step: add a int3 trap to the address that will be patched.
*/
- for (i = 0; i < nr_entries; i++) {
- tp[i].old = *(u8 *)text_poke_int3_addr(&tp[i]);
- text_poke(text_poke_int3_addr(&tp[i]), &int3, INT3_INSN_SIZE);
+ for (i = 0; i < tp_array.nr_entries; i++) {
+ tp_array.vec[i].old = *(u8 *)text_poke_int3_addr(&tp_array.vec[i]);
+ text_poke(text_poke_int3_addr(&tp_array.vec[i]), &int3, INT3_INSN_SIZE);
}
text_poke_sync();
@@ -2657,15 +2654,15 @@ static void text_poke_int3_batch(struct text_poke_int3_loc *tp, unsigned int nr_
/*
* Second step: update all but the first byte of the patched range.
*/
- for (do_sync = 0, i = 0; i < nr_entries; i++) {
- u8 old[POKE_MAX_OPCODE_SIZE+1] = { tp[i].old, };
+ for (do_sync = 0, i = 0; i < tp_array.nr_entries; i++) {
+ u8 old[POKE_MAX_OPCODE_SIZE+1] = { tp_array.vec[i].old, };
u8 _new[POKE_MAX_OPCODE_SIZE+1];
- const u8 *new = tp[i].text;
- int len = tp[i].len;
+ const u8 *new = tp_array.vec[i].text;
+ int len = tp_array.vec[i].len;
if (len - INT3_INSN_SIZE > 0) {
memcpy(old + INT3_INSN_SIZE,
- text_poke_int3_addr(&tp[i]) + INT3_INSN_SIZE,
+ text_poke_int3_addr(&tp_array.vec[i]) + INT3_INSN_SIZE,
len - INT3_INSN_SIZE);
if (len == 6) {
@@ -2674,7 +2671,7 @@ static void text_poke_int3_batch(struct text_poke_int3_loc *tp, unsigned int nr_
new = _new;
}
- text_poke(text_poke_int3_addr(&tp[i]) + INT3_INSN_SIZE,
+ text_poke(text_poke_int3_addr(&tp_array.vec[i]) + INT3_INSN_SIZE,
new + INT3_INSN_SIZE,
len - INT3_INSN_SIZE);
@@ -2705,7 +2702,7 @@ static void text_poke_int3_batch(struct text_poke_int3_loc *tp, unsigned int nr_
* The old instruction is recorded so that the event can be
* processed forwards or backwards.
*/
- perf_event_text_poke(text_poke_int3_addr(&tp[i]), old, len, new, len);
+ perf_event_text_poke(text_poke_int3_addr(&tp_array.vec[i]), old, len, new, len);
}
if (do_sync) {
@@ -2721,16 +2718,16 @@ static void text_poke_int3_batch(struct text_poke_int3_loc *tp, unsigned int nr_
* Third step: replace the first byte (int3) by the first byte of
* replacing opcode.
*/
- for (do_sync = 0, i = 0; i < nr_entries; i++) {
- u8 byte = tp[i].text[0];
+ for (do_sync = 0, i = 0; i < tp_array.nr_entries; i++) {
+ u8 byte = tp_array.vec[i].text[0];
- if (tp[i].len == 6)
+ if (tp_array.vec[i].len == 6)
byte = 0x0f;
if (byte == INT3_INSN_OPCODE)
continue;
- text_poke(text_poke_int3_addr(&tp[i]), &byte, INT3_INSN_SIZE);
+ text_poke(text_poke_int3_addr(&tp_array.vec[i]), &byte, INT3_INSN_SIZE);
do_sync++;
}
@@ -2859,7 +2856,7 @@ static bool tp_addr_ordered(void *addr)
void text_poke_int3_finish(void)
{
if (tp_array.nr_entries) {
- text_poke_int3_batch(tp_array.vec, tp_array.nr_entries);
+ text_poke_int3_batch();
tp_array.nr_entries = 0;
}
}
@@ -2869,7 +2866,7 @@ static void text_poke_int3_flush(void *addr)
lockdep_assert_held(&text_mutex);
if (tp_array.nr_entries == TP_ARRAY_NR_ENTRIES_MAX || !tp_addr_ordered(addr)) {
- text_poke_int3_batch(tp_array.vec, tp_array.nr_entries);
+ text_poke_int3_batch();
tp_array.nr_entries = 0;
}
}
--
2.45.2
* [PATCH 33/41] x86/alternatives: Rename 'text_poke_int3_batch()' to 'text_poke_int3_batch_process()'
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
` (31 preceding siblings ...)
2025-03-27 20:53 ` [PATCH 32/41] x86/alternatives: Simplify text_poke_int3_batch() Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 34/41] x86/alternatives: Rename 'int3_refs' to 'tp_array_refs' Ingo Molnar
` (8 subsequent siblings)
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
Make it clear in the name that this is the function that does
the actual batch processing (patching).
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/kernel/alternative.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 40e86b41bb86..6c3850527bd5 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2462,7 +2462,7 @@ struct text_poke_int3_loc {
u8 len;
u8 opcode;
const u8 text[POKE_MAX_OPCODE_SIZE];
- /* see text_poke_int3_batch() */
+ /* see text_poke_int3_batch_process() */
u8 old;
};
@@ -2532,7 +2532,7 @@ noinstr int text_poke_int3_handler(struct pt_regs *regs)
return 0;
/*
- * Discount the INT3. See text_poke_int3_batch().
+ * Discount the INT3. See text_poke_int3_batch_process().
*/
ip = (void *) regs->ip - INT3_INSN_SIZE;
@@ -2590,7 +2590,7 @@ noinstr int text_poke_int3_handler(struct pt_regs *regs)
}
/**
- * text_poke_int3_batch() -- update instructions on live kernel on SMP
+ * text_poke_int3_batch_process() -- update instructions on live kernel on SMP
* @tp_array.vec: vector of instructions to patch
* @tp_array.nr_entries: number of entries in the vector
*
@@ -2610,7 +2610,7 @@ noinstr int text_poke_int3_handler(struct pt_regs *regs)
* replacing opcode
* - sync cores
*/
-static void text_poke_int3_batch(void)
+static void text_poke_int3_batch_process(void)
{
unsigned char int3 = INT3_INSN_OPCODE;
unsigned int i;
@@ -2856,7 +2856,7 @@ static bool tp_addr_ordered(void *addr)
void text_poke_int3_finish(void)
{
if (tp_array.nr_entries) {
- text_poke_int3_batch();
+ text_poke_int3_batch_process();
tp_array.nr_entries = 0;
}
}
@@ -2866,7 +2866,7 @@ static void text_poke_int3_flush(void *addr)
lockdep_assert_held(&text_mutex);
if (tp_array.nr_entries == TP_ARRAY_NR_ENTRIES_MAX || !tp_addr_ordered(addr)) {
- text_poke_int3_batch();
+ text_poke_int3_batch_process();
tp_array.nr_entries = 0;
}
}
--
2.45.2
* [PATCH 34/41] x86/alternatives: Rename 'int3_refs' to 'tp_array_refs'
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
` (32 preceding siblings ...)
2025-03-27 20:53 ` [PATCH 33/41] x86/alternatives: Rename 'text_poke_int3_batch()' to 'text_poke_int3_batch_process()' Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 35/41] x86/alternatives: Move the tp_array manipulation into text_poke_int3_loc_init() and rename it to text_poke_int3_loc_add() Ingo Molnar
` (7 subsequent siblings)
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
Make it clear that these reference counts lock access
to tp_array.
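
The writer side of this scheme can be sketched as follows, again as a userspace approximation with C11 atomics. NR_CPUS, array_refs, open_array(), close_array() and array_is_open() are hypothetical names; the kernel uses real per-CPU variables, atomic_set_release(), atomic_dec_and_test() and atomic_cond_read_acquire() instead.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS 4	/* hypothetical fixed CPU count for the sketch */

/* One reference counter per CPU, zero while the array is unused. */
static atomic_int array_refs[NR_CPUS];

/* Writer: allow handlers on every CPU to take a reference. */
static void open_array(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		atomic_store_explicit(&array_refs[cpu], 1,
				      memory_order_release);
}

/* Writer: drop the initial reference and wait out in-flight handlers. */
static void close_array(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		/* fetch_sub returns the old value; 1 means we reached zero. */
		if (atomic_fetch_sub(&array_refs[cpu], 1) == 1)
			continue;

		/* A handler still holds a reference: spin until it is gone. */
		while (atomic_load_explicit(&array_refs[cpu],
					    memory_order_acquire) != 0)
			;
	}
}

static bool array_is_open(int cpu)
{
	return atomic_load(&array_refs[cpu]) != 0;
}
```

Once close_array() returns, no handler can still be reading the array, so the writer is free to reuse it for the next batch.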
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/kernel/alternative.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 6c3850527bd5..3ab40b0f5245 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2473,11 +2473,11 @@ static struct text_poke_int3_array {
struct text_poke_int3_loc vec[TP_ARRAY_NR_ENTRIES_MAX];
} tp_array;
-static DEFINE_PER_CPU(atomic_t, int3_refs);
+static DEFINE_PER_CPU(atomic_t, tp_array_refs);
static bool try_get_tp_array(void)
{
- atomic_t *refs = this_cpu_ptr(&int3_refs);
+ atomic_t *refs = this_cpu_ptr(&tp_array_refs);
if (!raw_atomic_inc_not_zero(refs))
return false;
@@ -2487,7 +2487,7 @@ static bool try_get_tp_array(void)
static __always_inline void put_tp_array(void)
{
- atomic_t *refs = this_cpu_ptr(&int3_refs);
+ atomic_t *refs = this_cpu_ptr(&tp_array_refs);
smp_mb__before_atomic();
raw_atomic_dec(refs);
@@ -2522,9 +2522,9 @@ noinstr int text_poke_int3_handler(struct pt_regs *regs)
* Having observed our INT3 instruction, we now must observe
* tp_array with non-zero refcount:
*
- * int3_refs = 1 INT3
+ * tp_array_refs = 1 INT3
* WMB RMB
- * write INT3 if (int3_refs != 0)
+ * write INT3 if (tp_array_refs != 0)
*/
smp_rmb();
@@ -2623,7 +2623,7 @@ static void text_poke_int3_batch_process(void)
* ensure reading a non-zero refcount provides up to date tp_array data.
*/
for_each_possible_cpu(i)
- atomic_set_release(per_cpu_ptr(&int3_refs, i), 1);
+ atomic_set_release(per_cpu_ptr(&tp_array_refs, i), 1);
/*
* Function tracing can enable thousands of places that need to be
@@ -2745,7 +2745,7 @@ static void text_poke_int3_batch_process(void)
* unused.
*/
for_each_possible_cpu(i) {
- atomic_t *refs = per_cpu_ptr(&int3_refs, i);
+ atomic_t *refs = per_cpu_ptr(&tp_array_refs, i);
if (unlikely(!atomic_dec_and_test(refs)))
atomic_cond_read_acquire(refs, !VAL);
--
2.45.2
* [PATCH 35/41] x86/alternatives: Move the tp_array manipulation into text_poke_int3_loc_init() and rename it to text_poke_int3_loc_add()
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
` (33 preceding siblings ...)
2025-03-27 20:53 ` [PATCH 34/41] x86/alternatives: Rename 'int3_refs' to 'tp_array_refs' Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 36/41] x86/alternatives: Remove the mixed-patching restriction on text_poke_int3() Ingo Molnar
` (6 subsequent siblings)
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
This simplifies the code and code generation a bit:
   text    data     bss     dec     hex  filename
  14802    1029    4112   19943    4de7  alternative.o.before
  14784    1029    4112   19925    4dd5  alternative.o.after
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/kernel/alternative.c | 16 ++++++----------
1 file changed, 6 insertions(+), 10 deletions(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 3ab40b0f5245..e1cc3e109feb 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2752,12 +2752,14 @@ static void text_poke_int3_batch_process(void)
}
}
-static void text_poke_int3_loc_init(struct text_poke_int3_loc *tp, void *addr,
- const void *opcode, size_t len, const void *emulate)
+static void text_poke_int3_loc_add(void *addr, const void *opcode, size_t len, const void *emulate)
{
+ struct text_poke_int3_loc *tp;
struct insn insn;
int ret, i = 0;
+ tp = &tp_array.vec[tp_array.nr_entries++];
+
if (len == 6)
i = 1;
memcpy((void *)tp->text, opcode+i, len-i);
@@ -2873,12 +2875,9 @@ static void text_poke_int3_flush(void *addr)
void __ref text_poke_int3_queue(void *addr, const void *opcode, size_t len, const void *emulate)
{
- struct text_poke_int3_loc *tp;
-
text_poke_int3_flush(addr);
- tp = &tp_array.vec[tp_array.nr_entries++];
- text_poke_int3_loc_init(tp, addr, opcode, len, emulate);
+ text_poke_int3_loc_add(addr, opcode, len, emulate);
}
/**
@@ -2894,13 +2893,10 @@ void __ref text_poke_int3_queue(void *addr, const void *opcode, size_t len, cons
*/
void __ref text_poke_int3(void *addr, const void *opcode, size_t len, const void *emulate)
{
- struct text_poke_int3_loc *tp;
-
/* Batch-patching should not be mixed with single-patching: */
WARN_ON_ONCE(tp_array.nr_entries != 0);
- tp = &tp_array.vec[tp_array.nr_entries++];
- text_poke_int3_loc_init(tp, addr, opcode, len, emulate);
+ text_poke_int3_loc_add(addr, opcode, len, emulate);
text_poke_int3_finish();
}
--
2.45.2
* [PATCH 36/41] x86/alternatives: Remove the mixed-patching restriction on text_poke_int3()
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
` (34 preceding siblings ...)
2025-03-27 20:53 ` [PATCH 35/41] x86/alternatives: Move the tp_array manipulation into text_poke_int3_loc_init() and rename it to text_poke_int3_loc_add() Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 37/41] x86/alternatives: Rename 'text_poke_int3()' to 'text_poke_int3_now()' Ingo Molnar
` (5 subsequent siblings)
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
At this point text_poke_int3(addr, opcode, len, emulate) is equivalent to:
text_poke_int3_queue(addr, opcode, len, emulate);
text_poke_int3_finish();
So remove the restriction on mixing single-instruction patching
with multi-instruction patching.
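
The queueing behaviour described above can be modelled with a small userspace sketch. All names here (arr, batch_process(), addr_ordered(), flush(), finish(), queue(), poke_now(), the counters and MAX_ENTRIES) are hypothetical, and the patching itself is replaced by bookkeeping. Note that poke_now() takes the equivalence from the commit message literally and goes through queue(), so the capacity and ordering checks apply; the patched kernel function calls text_poke_int3_loc_add() directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ENTRIES 4	/* hypothetical; stands in for TP_ARRAY_NR_ENTRIES_MAX */

static struct {
	uintptr_t vec[MAX_ENTRIES];
	unsigned int nr_entries;
} arr;

static unsigned int nr_batches;	/* number of processed batches */
static unsigned int nr_patched;	/* number of processed requests */

/* Stand-in for the real INT3 patching; only does the bookkeeping. */
static void batch_process(void)
{
	nr_batches++;
	nr_patched += arr.nr_entries;
}

/* The array must stay sorted so the handler can binary-search it. */
static bool addr_ordered(uintptr_t addr)
{
	return !arr.nr_entries || addr > arr.vec[arr.nr_entries - 1];
}

/* Process the pending batch if 'addr' cannot be appended to it. */
static void flush(uintptr_t addr)
{
	if (arr.nr_entries == MAX_ENTRIES || !addr_ordered(addr)) {
		batch_process();
		arr.nr_entries = 0;
	}
}

/* Process whatever is still pending. */
static void finish(void)
{
	if (arr.nr_entries) {
		batch_process();
		arr.nr_entries = 0;
	}
}

static void queue(uintptr_t addr)
{
	flush(addr);
	assert(arr.nr_entries < MAX_ENTRIES);
	arr.vec[arr.nr_entries++] = addr;
}

/* Immediate variant: queue one request, then process everything. */
static void poke_now(uintptr_t addr)
{
	queue(addr);
	finish();
}
```

In this model an immediate request issued while other requests are pending simply joins the pending batch and completes it, so nothing is lost by mixing the two styles.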
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/kernel/alternative.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index e1cc3e109feb..2807d35c7676 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2893,9 +2893,6 @@ void __ref text_poke_int3_queue(void *addr, const void *opcode, size_t len, cons
*/
void __ref text_poke_int3(void *addr, const void *opcode, size_t len, const void *emulate)
{
- /* Batch-patching should not be mixed with single-patching: */
- WARN_ON_ONCE(tp_array.nr_entries != 0);
-
text_poke_int3_loc_add(addr, opcode, len, emulate);
text_poke_int3_finish();
--
2.45.2
* [PATCH 37/41] x86/alternatives: Rename 'text_poke_int3()' to 'text_poke_int3_now()'
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
` (35 preceding siblings ...)
2025-03-27 20:53 ` [PATCH 36/41] x86/alternatives: Remove the mixed-patching restriction on text_poke_int3() Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 38/41] x86/alternatives: Add documentation for text_poke_int3_queue() Ingo Molnar
` (4 subsequent siblings)
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
With the current name it's not obvious that the main difference
between text_poke_int3() and text_poke_int3_queue() is that
text_poke_int3() patches the kernel immediately.
Make this more apparent by renaming it to text_poke_int3_now().
Also extend the documentation to better describe its purpose.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/include/asm/text-patching.h | 2 +-
arch/x86/kernel/alternative.c | 7 ++++---
arch/x86/kernel/ftrace.c | 8 ++++----
arch/x86/kernel/jump_label.c | 2 +-
arch/x86/kernel/kprobes/opt.c | 2 +-
arch/x86/kernel/static_call.c | 2 +-
arch/x86/net/bpf_jit_comp.c | 2 +-
7 files changed, 13 insertions(+), 12 deletions(-)
diff --git a/arch/x86/include/asm/text-patching.h b/arch/x86/include/asm/text-patching.h
index 7deb06aec467..611957617278 100644
--- a/arch/x86/include/asm/text-patching.h
+++ b/arch/x86/include/asm/text-patching.h
@@ -39,7 +39,7 @@ extern void *text_poke_copy(void *addr, const void *opcode, size_t len);
extern void *text_poke_copy_locked(void *addr, const void *opcode, size_t len, bool core_ok);
extern void *text_poke_set(void *addr, int c, size_t len);
extern int text_poke_int3_handler(struct pt_regs *regs);
-extern void text_poke_int3(void *addr, const void *opcode, size_t len, const void *emulate);
+extern void text_poke_int3_now(void *addr, const void *opcode, size_t len, const void *emulate);
extern void text_poke_int3_queue(void *addr, const void *opcode, size_t len, const void *emulate);
extern void text_poke_int3_finish(void);
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 2807d35c7676..6e2fab1768e2 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2881,7 +2881,7 @@ void __ref text_poke_int3_queue(void *addr, const void *opcode, size_t len, cons
}
/**
- * text_poke_int3() -- update instructions on live kernel on SMP
+ * text_poke_int3_now() -- update instruction on live kernel on SMP immediately
* @addr: address to patch
* @opcode: opcode of new instruction
* @len: length to copy
@@ -2889,9 +2889,10 @@ void __ref text_poke_int3_queue(void *addr, const void *opcode, size_t len, cons
*
* Update a single instruction with the vector in the stack, avoiding
* dynamically allocated memory. This function should be used when it is
- * not possible to allocate memory.
+ * not possible to allocate memory for a vector. The single instruction
+ * is patched in immediately.
*/
-void __ref text_poke_int3(void *addr, const void *opcode, size_t len, const void *emulate)
+void __ref text_poke_int3_now(void *addr, const void *opcode, size_t len, const void *emulate)
{
text_poke_int3_loc_add(addr, opcode, len, emulate);
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index ff3cdd08f28f..40b1c218ee86 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -186,11 +186,11 @@ int ftrace_update_ftrace_func(ftrace_func_t func)
ip = (unsigned long)(&ftrace_call);
new = ftrace_call_replace(ip, (unsigned long)func);
- text_poke_int3((void *)ip, new, MCOUNT_INSN_SIZE, NULL);
+ text_poke_int3_now((void *)ip, new, MCOUNT_INSN_SIZE, NULL);
ip = (unsigned long)(&ftrace_regs_call);
new = ftrace_call_replace(ip, (unsigned long)func);
- text_poke_int3((void *)ip, new, MCOUNT_INSN_SIZE, NULL);
+ text_poke_int3_now((void *)ip, new, MCOUNT_INSN_SIZE, NULL);
return 0;
}
@@ -492,7 +492,7 @@ void arch_ftrace_update_trampoline(struct ftrace_ops *ops)
mutex_lock(&text_mutex);
/* Do a safe modify in case the trampoline is executing */
new = ftrace_call_replace(ip, (unsigned long)func);
- text_poke_int3((void *)ip, new, MCOUNT_INSN_SIZE, NULL);
+ text_poke_int3_now((void *)ip, new, MCOUNT_INSN_SIZE, NULL);
mutex_unlock(&text_mutex);
}
@@ -586,7 +586,7 @@ static int ftrace_mod_jmp(unsigned long ip, void *func)
const char *new;
new = ftrace_jmp_replace(ip, (unsigned long)func);
- text_poke_int3((void *)ip, new, MCOUNT_INSN_SIZE, NULL);
+ text_poke_int3_now((void *)ip, new, MCOUNT_INSN_SIZE, NULL);
return 0;
}
diff --git a/arch/x86/kernel/jump_label.c b/arch/x86/kernel/jump_label.c
index f72738e6d7d4..e5b58c81dfaf 100644
--- a/arch/x86/kernel/jump_label.c
+++ b/arch/x86/kernel/jump_label.c
@@ -102,7 +102,7 @@ __jump_label_transform(struct jump_entry *entry,
return;
}
- text_poke_int3((void *)jump_entry_code(entry), jlp.code, jlp.size, NULL);
+ text_poke_int3_now((void *)jump_entry_code(entry), jlp.code, jlp.size, NULL);
}
static void __ref jump_label_transform(struct jump_entry *entry,
diff --git a/arch/x86/kernel/kprobes/opt.c b/arch/x86/kernel/kprobes/opt.c
index e13d4a2d9244..54bc5e7c6886 100644
--- a/arch/x86/kernel/kprobes/opt.c
+++ b/arch/x86/kernel/kprobes/opt.c
@@ -488,7 +488,7 @@ void arch_optimize_kprobes(struct list_head *oplist)
insn_buff[0] = JMP32_INSN_OPCODE;
*(s32 *)(&insn_buff[1]) = rel;
- text_poke_int3(op->kp.addr, insn_buff, JMP32_INSN_SIZE, NULL);
+ text_poke_int3_now(op->kp.addr, insn_buff, JMP32_INSN_SIZE, NULL);
list_del_init(&op->list);
}
diff --git a/arch/x86/kernel/static_call.c b/arch/x86/kernel/static_call.c
index 3331a7c90b9a..146cc27848df 100644
--- a/arch/x86/kernel/static_call.c
+++ b/arch/x86/kernel/static_call.c
@@ -108,7 +108,7 @@ static void __ref __static_call_transform(void *insn, enum insn_type type,
if (system_state == SYSTEM_BOOTING || modinit)
return text_poke_early(insn, code, size);
- text_poke_int3(insn, code, size, emulate);
+ text_poke_int3_now(insn, code, size, emulate);
}
static void __static_call_validate(u8 *insn, bool tail, bool tramp)
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 1e2a4b7a6b73..8d08c8ff3e50 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -629,7 +629,7 @@ static int __bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t,
goto out;
ret = 1;
if (memcmp(ip, new_insn, X86_PATCH_SIZE)) {
- text_poke_int3(ip, new_insn, X86_PATCH_SIZE, NULL);
+ text_poke_int3_now(ip, new_insn, X86_PATCH_SIZE, NULL);
ret = 0;
}
out:
--
2.45.2
* [PATCH 38/41] x86/alternatives: Add documentation for text_poke_int3_queue()
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
` (36 preceding siblings ...)
2025-03-27 20:53 ` [PATCH 37/41] x86/alternatives: Rename 'text_poke_int3()' to 'text_poke_int3_now()' Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 39/41] x86/alternatives: Move tp_array completion from text_poke_int3_finish() and text_poke_int3_flush() to text_poke_int3_batch_process() Ingo Molnar
` (3 subsequent siblings)
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/kernel/alternative.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 6e2fab1768e2..ba322a29aefd 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2873,6 +2873,19 @@ static void text_poke_int3_flush(void *addr)
}
}
+/**
+ * text_poke_int3_queue() -- update instruction on live kernel on SMP, batched
+ * @addr: address to patch
+ * @opcode: opcode of new instruction
+ * @len: length to copy
+ * @emulate: instruction to be emulated
+ *
+ * Add a new instruction to the current queue of to-be-patched instructions
+ * the kernel maintains. The patching request will not be executed immediately,
+ * but becomes part of an array of patching requests, optimized for batched
+ * execution. All pending patching requests will be executed on the next
+ * text_poke_int3_finish() call.
+ */
void __ref text_poke_int3_queue(void *addr, const void *opcode, size_t len, const void *emulate)
{
text_poke_int3_flush(addr);
--
2.45.2
* [PATCH 39/41] x86/alternatives: Move tp_array completion from text_poke_int3_finish() and text_poke_int3_flush() to text_poke_int3_batch_process()
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
` (37 preceding siblings ...)
2025-03-27 20:53 ` [PATCH 38/41] x86/alternatives: Add documentation for text_poke_int3_queue() Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 40/41] x86/alternatives: Rename 'text_poke_sync()' to 'text_poke_sync_each_cpu()' Ingo Molnar
` (2 subsequent siblings)
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
Simplifies the code and improves code generation a bit:
   text    data     bss     dec     hex  filename
  14769    1017    4112   19898    4dba  alternative.o.before
  14742    1017    4112   19871    4d9f  alternative.o.after
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/kernel/alternative.c | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index ba322a29aefd..1b523496a2f6 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2750,6 +2750,9 @@ static void text_poke_int3_batch_process(void)
if (unlikely(!atomic_dec_and_test(refs)))
atomic_cond_read_acquire(refs, !VAL);
}
+
+ /* They are all completed: */
+ tp_array.nr_entries = 0;
}
static void text_poke_int3_loc_add(void *addr, const void *opcode, size_t len, const void *emulate)
@@ -2857,20 +2860,16 @@ static bool tp_addr_ordered(void *addr)
void text_poke_int3_finish(void)
{
- if (tp_array.nr_entries) {
+ if (tp_array.nr_entries)
text_poke_int3_batch_process();
- tp_array.nr_entries = 0;
- }
}
static void text_poke_int3_flush(void *addr)
{
lockdep_assert_held(&text_mutex);
- if (tp_array.nr_entries == TP_ARRAY_NR_ENTRIES_MAX || !tp_addr_ordered(addr)) {
+ if (tp_array.nr_entries == TP_ARRAY_NR_ENTRIES_MAX || !tp_addr_ordered(addr))
text_poke_int3_batch_process();
- tp_array.nr_entries = 0;
- }
}
/**
--
2.45.2
* [PATCH 40/41] x86/alternatives: Rename 'text_poke_sync()' to 'text_poke_sync_each_cpu()'
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
` (38 preceding siblings ...)
2025-03-27 20:53 ` [PATCH 39/41] x86/alternatives: Move tp_array completion from text_poke_int3_finish() and text_poke_int3_flush() to text_poke_int3_batch_process() Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-04-02 4:10 ` H. Peter Anvin
2025-03-27 20:53 ` [PATCH 41/41] x86/alternatives: Simplify tp_addr_ordered() Ingo Molnar
2025-03-27 22:19 ` [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Linus Torvalds
41 siblings, 1 reply; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
Unlike sync_core(), text_poke_sync() is a very heavy operation, as
it sends an IPI to every online CPU in the system and waits for
completion.
Reflect this in the name.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/include/asm/text-patching.h | 2 +-
arch/x86/kernel/alternative.c | 12 ++++++------
arch/x86/kernel/kprobes/core.c | 4 ++--
arch/x86/kernel/kprobes/opt.c | 4 ++--
arch/x86/kernel/module.c | 2 +-
5 files changed, 12 insertions(+), 12 deletions(-)
diff --git a/arch/x86/include/asm/text-patching.h b/arch/x86/include/asm/text-patching.h
index 611957617278..ff30aa1d0c47 100644
--- a/arch/x86/include/asm/text-patching.h
+++ b/arch/x86/include/asm/text-patching.h
@@ -32,7 +32,7 @@ extern void apply_relocation(u8 *buf, const u8 * const instr, size_t instrlen, u
* an inconsistent instruction while you patch.
*/
extern void *text_poke(void *addr, const void *opcode, size_t len);
-extern void text_poke_sync(void);
+extern void text_poke_sync_each_cpu(void);
extern void *text_poke_kgdb(void *addr, const void *opcode, size_t len);
extern void *text_poke_copy(void *addr, const void *opcode, size_t len);
#define text_poke_copy text_poke_copy
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 1b523496a2f6..32d3707d7963 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2445,7 +2445,7 @@ static void do_sync_core(void *info)
sync_core();
}
-void text_poke_sync(void)
+void text_poke_sync_each_cpu(void)
{
on_each_cpu(do_sync_core, NULL, 1);
}
@@ -2469,8 +2469,8 @@ struct text_poke_int3_loc {
#define TP_ARRAY_NR_ENTRIES_MAX (PAGE_SIZE / sizeof(struct text_poke_int3_loc))
static struct text_poke_int3_array {
- int nr_entries;
struct text_poke_int3_loc vec[TP_ARRAY_NR_ENTRIES_MAX];
+ int nr_entries;
} tp_array;
static DEFINE_PER_CPU(atomic_t, tp_array_refs);
@@ -2649,7 +2649,7 @@ static void text_poke_int3_batch_process(void)
text_poke(text_poke_int3_addr(&tp_array.vec[i]), &int3, INT3_INSN_SIZE);
}
- text_poke_sync();
+ text_poke_sync_each_cpu();
/*
* Second step: update all but the first byte of the patched range.
@@ -2711,7 +2711,7 @@ static void text_poke_int3_batch_process(void)
* not necessary and we'd be safe even without it. But
* better safe than sorry (plus there's not only Intel).
*/
- text_poke_sync();
+ text_poke_sync_each_cpu();
}
/*
@@ -2732,13 +2732,13 @@ static void text_poke_int3_batch_process(void)
}
if (do_sync)
- text_poke_sync();
+ text_poke_sync_each_cpu();
/*
* Remove and wait for refs to be zero.
*
* Notably, if after step-3 above the INT3 got removed, then the
- * text_poke_sync() will have serialized against any running INT3
+ * text_poke_sync_each_cpu() will have serialized against any running INT3
* handlers and the below spin-wait will not happen.
*
* IOW. unless the replacement instruction is INT3, this case goes
diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 09608fd93687..5e35c95524dc 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -808,7 +808,7 @@ void arch_arm_kprobe(struct kprobe *p)
u8 int3 = INT3_INSN_OPCODE;
text_poke(p->addr, &int3, 1);
- text_poke_sync();
+ text_poke_sync_each_cpu();
perf_event_text_poke(p->addr, &p->opcode, 1, &int3, 1);
}
@@ -818,7 +818,7 @@ void arch_disarm_kprobe(struct kprobe *p)
perf_event_text_poke(p->addr, &int3, 1, &p->opcode, 1);
text_poke(p->addr, &p->opcode, 1);
- text_poke_sync();
+ text_poke_sync_each_cpu();
}
void arch_remove_kprobe(struct kprobe *p)
diff --git a/arch/x86/kernel/kprobes/opt.c b/arch/x86/kernel/kprobes/opt.c
index 54bc5e7c6886..5efa7b50bbb3 100644
--- a/arch/x86/kernel/kprobes/opt.c
+++ b/arch/x86/kernel/kprobes/opt.c
@@ -513,11 +513,11 @@ void arch_unoptimize_kprobe(struct optimized_kprobe *op)
JMP32_INSN_SIZE - INT3_INSN_SIZE);
text_poke(addr, new, INT3_INSN_SIZE);
- text_poke_sync();
+ text_poke_sync_each_cpu();
text_poke(addr + INT3_INSN_SIZE,
new + INT3_INSN_SIZE,
JMP32_INSN_SIZE - INT3_INSN_SIZE);
- text_poke_sync();
+ text_poke_sync_each_cpu();
perf_event_text_poke(op->kp.addr, old, JMP32_INSN_SIZE, new, JMP32_INSN_SIZE);
}
diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
index a7998f351701..1c598c90e24d 100644
--- a/arch/x86/kernel/module.c
+++ b/arch/x86/kernel/module.c
@@ -206,7 +206,7 @@ static int write_relocate_add(Elf64_Shdr *sechdrs,
write, apply);
if (!early) {
- text_poke_sync();
+ text_poke_sync_each_cpu();
mutex_unlock(&text_mutex);
}
--
2.45.2
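
For readers who have not looked at the call sites touched above, the ordering of the writes and the heavy syncs can be sketched as a userspace model. Everything here is an assumption made for illustration: model_sync_each_cpu() only counts calls where the real text_poke_sync_each_cpu() IPIs every online CPU, model_patch() writes to an ordinary byte buffer instead of kernel text, and the INT3 trap handler that emulates the instruction while the breakpoint is in place is not modelled at all. The real batch code also skips some of these syncs when they are not needed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define INT3_OPCODE 0xCC	/* the one-byte x86 breakpoint instruction */

static int nr_syncs;	/* counts the simulated heavy synchronizations */

/* Stand-in for text_poke_sync_each_cpu(): the real one IPIs every CPU. */
static void model_sync_each_cpu(void)
{
	nr_syncs++;
}

/* Replace the instruction at 'text' with 'new_insn' in three steps. */
static void model_patch(unsigned char *text, const unsigned char *new_insn,
			size_t len)
{
	assert(len >= 1);

	/* First step: put a breakpoint over the first byte. */
	text[0] = INT3_OPCODE;
	model_sync_each_cpu();

	/* Second step: update all but the first byte of the patched range. */
	if (len > 1) {
		memcpy(text + 1, new_insn + 1, len - 1);
		model_sync_each_cpu();
	}

	/* Third step: replace the breakpoint with the new first byte. */
	text[0] = new_insn[0];
	model_sync_each_cpu();
}
```

Patching a 5-byte region costs three syncs in this model, each of which would be a round of IPIs to all CPUs in the kernel, which is the cost the new name is meant to advertise.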
* [PATCH 41/41] x86/alternatives: Simplify tp_addr_ordered()
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
` (39 preceding siblings ...)
2025-03-27 20:53 ` [PATCH 40/41] x86/alternatives: Rename 'text_poke_sync()' to 'text_poke_sync_each_cpu()' Ingo Molnar
@ 2025-03-27 20:53 ` Ingo Molnar
2025-03-27 22:19 ` [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Linus Torvalds
41 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-27 20:53 UTC (permalink / raw)
To: linux-kernel
Cc: Juergen Gross, H . Peter Anvin, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
- Use direct 'void *' pointer comparison; there's no
need to force the type to 'unsigned long'.
- Remove the 'tp' local variable indirection.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
arch/x86/kernel/alternative.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 32d3707d7963..7367c829a4fb 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2838,8 +2838,6 @@ static void text_poke_int3_loc_add(void *addr, const void *opcode, size_t len, c
*/
static bool tp_addr_ordered(void *addr)
{
- struct text_poke_int3_loc *tp;
-
WARN_ON_ONCE(!addr);
if (!tp_array.nr_entries)
@@ -2851,8 +2849,7 @@ static bool tp_addr_ordered(void *addr)
* is violated and we must first flush all pending patching
* requests:
*/
- tp = &tp_array.vec[tp_array.nr_entries-1];
- if ((unsigned long)text_poke_int3_addr(tp) > (unsigned long)addr)
+ if (text_poke_int3_addr(tp_array.vec + tp_array.nr_entries-1) > addr)
return false;
return true;
--
2.45.2
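
The ordering check simplified above can be tried out in userspace with a small model. This is a sketch, not the kernel code: the 'queue' structure, QUEUE_MAX, addr_ordered() and queue_addr() are hypothetical names, and a "flush" here simply empties the queue. One caveat worth stating: ISO C only defines relational comparison for pointers into the same object, so the model (and its intended use) sticks to addresses inside a single buffer, whereas the kernel relies on its flat address space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_MAX 8	/* hypothetical capacity */

static struct {
	void *addr[QUEUE_MAX];
	int nr_entries;
} queue;

/* Return true if 'addr' may be appended without breaking the sort order. */
static bool addr_ordered(void *addr)
{
	assert(addr);

	if (!queue.nr_entries)
		return true;

	/* Direct pointer comparison against the last queued address. */
	if (queue.addr[queue.nr_entries - 1] > addr)
		return false;

	return true;
}

/* Queue 'addr', emptying the queue first if it is full or out of order. */
static void queue_addr(void *addr)
{
	if (queue.nr_entries == QUEUE_MAX || !addr_ordered(addr))
		queue.nr_entries = 0;	/* models flushing the pending batch */

	queue.addr[queue.nr_entries++] = addr;
}
```

Queueing ascending addresses keeps them in one batch, while an address below the last queued one forces a flush first. An address equal to the last one is accepted, since the check is a strict greater-than.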
* Re: [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c)
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
` (40 preceding siblings ...)
2025-03-27 20:53 ` [PATCH 41/41] x86/alternatives: Simplify tp_addr_ordered() Ingo Molnar
@ 2025-03-27 22:19 ` Linus Torvalds
2025-03-28 10:10 ` Ingo Molnar
2025-04-01 14:55 ` Peter Zijlstra
41 siblings, 2 replies; 47+ messages in thread
From: Linus Torvalds @ 2025-03-27 22:19 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, Juergen Gross, H . Peter Anvin, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
On Thu, 27 Mar 2025 at 13:54, Ingo Molnar <mingo@kernel.org> wrote:
>
> The second part of the series simplifies and standardizes the SMP batch-patching
> data & types namespace, around the new tp_array* namespace:
>
> int3_patching_desc => [removed]
> temp_mm_state_t => [removed]
> try_get_desc() => [removed]
> put_desc() => [removed]
>
> tp_vec,tp_vec_nr => tp_array
> int3_refs => tp_array_refs
Honestly, I think "int3" is better than "tp" as a part of the name.
"tp" doesn't say _anything_ to me, even though I understand it is
short for "text poke". But if you want to say "text_poke", please just
write it out.
At least "int3" has some meaning in x86 context, unlike "tp".
So please either write out "text_poke" and accept that the names are a
bit longer (but a lot more descriptive), or use "int3" if you want to
save some typing.
Linus
PS. The casual meaning "tp" has in English everyday language is short
for "toilet paper".
* Re: [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c)
2025-03-27 22:19 ` [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Linus Torvalds
@ 2025-03-28 10:10 ` Ingo Molnar
2025-04-01 14:55 ` Peter Zijlstra
1 sibling, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-03-28 10:10 UTC (permalink / raw)
To: Linus Torvalds
Cc: linux-kernel, Juergen Gross, H . Peter Anvin, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
* Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Thu, 27 Mar 2025 at 13:54, Ingo Molnar <mingo@kernel.org> wrote:
> >
> > The second part of the series simplifies and standardizes the SMP batch-patching
> > data & types namespace, around the new tp_array* namespace:
> >
> > int3_patching_desc => [removed]
> > temp_mm_state_t => [removed]
> > try_get_desc() => [removed]
> > put_desc() => [removed]
> >
> > tp_vec,tp_vec_nr => tp_array
> > int3_refs => tp_array_refs
>
> Honestly, I think "int3" is better than "tp" as a part of the name.
Yeah, was thinking hard about those two as well, but skipped it for
this series, because of the brevity issue: text_poke_int3_ is quite a
mouthful for a commonly used global state variable.
> "tp" doesn't say _anything_ to me, even though I understand it is
> short for "text poke". But if you want to say "text_poke", please
> just write it out.
>
> At least "int3" has some meaning in x86 context, unlike "tp".
>
> So please either write out "text_poke" and accept that the names are
> a bit longer (but a lot more descriptive), or use "int3" if you want
> to save some typing.
Yeah.
So the thing is, the whole _int3 naming itself is a bit artificial
IMHO: what we *really* want to signal here is whether something is
boot/UP functionality or SMP functionality under the text_mutex.
That the SMP functionality relies on INT3 traps is an implementation
detail that doesn't need to show up in the API namespace. It also
relies on CR3 flushing, so we could as well have added _cr3. ;-)
So I was thinking about something like this for the boot/UP variants:
text_poke_*()
and for the SMP variants:
smp_text_poke_*()
and text_poke_* for the data/type space.
Plus I think with the addition of 'smp_' we can also add 'batch_' to a
few APIs to make the family of APIs clearer, plus a few other things:
A quick summary of changes (mockup):
 # boot/UP APIs & single-thread helpers:

   [ unchanged APIs: ]      text_poke()
                            text_poke_kgdb()
                            text_poke_copy()
                            text_poke_copy_locked()
                            text_poke_set()
                            text_poke_addr()

 # SMP API & helpers namespace:

   text_poke_bp()           => smp_text_poke_single()
   text_poke_loc_init()     => __smp_text_poke_batch_add()
   text_poke_queue()        => smp_text_poke_batch_add()
   text_poke_finish()       => smp_text_poke_batch_finish()
   text_poke_flush()        => smp_text_poke_batch_flush()
   text_poke_bp_batch()     => smp_text_poke_batch_process()
   poke_int3_handler()      => smp_text_poke_int3_trap_handler()
   text_poke_sync()         => smp_text_poke_sync_each_cpu()

 # data/type namespace:

   int3_patching_desc       => [removed]
   temp_mm_state_t          => [removed]
   try_get_desc()           => [removed]
   put_desc()               => [removed]
   tp_vec,tp_vec_nr         => text_poke_array
   int3_refs                => text_poke_array_refs

Some of the API names are now a bit long, but I think this is one of
the cases where clarity is more important than brevity, plus these are
usually used in a pretty compact, straightforward fashion to trigger
text-patching processing, not part of complex compound expressions.
I'll propagate this nomenclature into the series and repost.
> PS. The casual meaning "tp" has in English everyday language is short
> for "toilet paper".
LOL, this seals the deal, the tp_ prefix is *so* dead.
Thanks,
Ingo
* Re: [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c)
2025-03-27 22:19 ` [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Linus Torvalds
2025-03-28 10:10 ` Ingo Molnar
@ 2025-04-01 14:55 ` Peter Zijlstra
1 sibling, 0 replies; 47+ messages in thread
From: Peter Zijlstra @ 2025-04-01 14:55 UTC (permalink / raw)
To: Linus Torvalds
Cc: Ingo Molnar, linux-kernel, Juergen Gross, H . Peter Anvin,
Borislav Petkov, Thomas Gleixner
On Thu, Mar 27, 2025 at 03:19:40PM -0700, Linus Torvalds wrote:
> PS. The casual meaning "tp" has in English everyday language is short
> for "toilet paper".
Yes, which is why the old names were awesome :-) /me runs
* Re: [PATCH 40/41] x86/alternatives: Rename 'text_poke_sync()' to 'text_poke_sync_each_cpu()'
2025-03-27 20:53 ` [PATCH 40/41] x86/alternatives: Rename 'text_poke_sync()' to 'text_poke_sync_each_cpu()' Ingo Molnar
@ 2025-04-02 4:10 ` H. Peter Anvin
2025-04-03 15:05 ` Ingo Molnar
0 siblings, 1 reply; 47+ messages in thread
From: H. Peter Anvin @ 2025-04-02 4:10 UTC (permalink / raw)
To: Ingo Molnar, linux-kernel
Cc: Juergen Gross, Linus Torvalds, Peter Zijlstra, Borislav Petkov,
Thomas Gleixner
On March 27, 2025 1:53:53 PM PDT, Ingo Molnar <mingo@kernel.org> wrote:
>Unlike sync_core(), text_poke_sync() is a very heavy operation, as
>it sends an IPI to every online CPU in the system and waits for
>completion.
>
>Reflect this in the name.
Is that the only use case we have for syncing all CPUs?
* Re: [PATCH 40/41] x86/alternatives: Rename 'text_poke_sync()' to 'text_poke_sync_each_cpu()'
2025-04-02 4:10 ` H. Peter Anvin
@ 2025-04-03 15:05 ` Ingo Molnar
0 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2025-04-03 15:05 UTC (permalink / raw)
To: H. Peter Anvin
Cc: linux-kernel, Juergen Gross, Linus Torvalds, Peter Zijlstra,
Borislav Petkov, Thomas Gleixner
* H. Peter Anvin <hpa@zytor.com> wrote:
> Is that the only use case we have for syncing all CPUs?
So there's:
- kernel/sched/membarrier.c's ipi_sync_core() which does sync_core(),
but it's embedded into a larger array of IPI handlers, called via
smp_call_function_many()/on_each_cpu_mask(), so I don't think
functionality can be shared there.
- there's arch/x86/kernel/cpu/mce/core.c's machine_check_poll() handler
and the kill_me_maybe() function, but these are single-threaded.
- then there's arch/x86/kernel/static_call.c's
__static_call_update_early(), used indirectly by
arch/x86/xen/enlighten.c and arch/x86/xen/enlighten_pv.c, but these
are single-threaded too due to being early boot code.
So not much, I think - at least from what I've managed to find.
Thanks,
Ingo
Thread overview: 47+ messages
2025-03-27 20:53 [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Ingo Molnar
2025-03-27 20:53 ` [PATCH 01/41] x86/alternatives: Rename 'struct bp_patching_desc' to 'struct int3_patching_desc' Ingo Molnar
2025-03-27 20:53 ` [PATCH 02/41] x86/alternatives: Rename 'bp_refs' to 'int3_refs' Ingo Molnar
2025-03-27 20:53 ` [PATCH 03/41] x86/alternatives: Rename 'text_poke_bp_batch()' to 'text_poke_int3_batch()' Ingo Molnar
2025-03-27 20:53 ` [PATCH 04/41] x86/alternatives: Rename 'text_poke_bp()' to 'text_poke_int3()' Ingo Molnar
2025-03-27 20:53 ` [PATCH 05/41] x86/alternatives: Rename 'poke_int3_handler()' to 'text_poke_int3_handler()' Ingo Molnar
2025-03-27 20:53 ` [PATCH 06/41] x86/alternatives: Rename 'poking_mm' to 'text_poke_mm' Ingo Molnar
2025-03-27 20:53 ` [PATCH 07/41] x86/alternatives: Rename 'text_poke_addr' to 'text_poke_int3_addr' Ingo Molnar
2025-03-27 20:53 ` [PATCH 08/41] x86/alternatives: Rename 'poking_addr' to 'text_poke_addr' Ingo Molnar
2025-03-27 20:53 ` [PATCH 09/41] x86/alternatives: Rename 'bp_desc' to 'int3_desc' Ingo Molnar
2025-03-27 20:53 ` [PATCH 10/41] x86/alternatives: Remove duplicate 'text_poke_early()' prototype Ingo Molnar
2025-03-27 20:53 ` [PATCH 11/41] x86/alternatives: Update comments in int3_emulate_push() Ingo Molnar
2025-03-27 20:53 ` [PATCH 12/41] x86/alternatives: Remove the confusing, inaccurate & unnecessary 'temp_mm_state_t' abstraction Ingo Molnar
2025-03-27 20:53 ` [PATCH 13/41] x86/alternatives: Rename 'text_poke_flush()' to 'text_poke_int3_flush()' Ingo Molnar
2025-03-27 20:53 ` [PATCH 14/41] x86/alternatives: Rename 'text_poke_finish()' to 'text_poke_int3_finish()' Ingo Molnar
2025-03-27 20:53 ` [PATCH 15/41] x86/alternatives: Rename 'text_poke_queue()' to 'text_poke_int3_queue()' Ingo Molnar
2025-03-27 20:53 ` [PATCH 16/41] x86/alternatives: Rename 'text_poke_loc_init()' to 'text_poke_int3_loc_init()' Ingo Molnar
2025-03-27 20:53 ` [PATCH 17/41] x86/alternatives: Rename 'struct text_poke_loc' to 'struct text_poke_int3_loc' Ingo Molnar
2025-03-27 20:53 ` [PATCH 18/41] x86/alternatives: Rename 'struct int3_patching_desc' to 'struct text_poke_int3_vec' Ingo Molnar
2025-03-27 20:53 ` [PATCH 19/41] x86/alternatives: Rename 'int3_desc' to 'int3_vec' Ingo Molnar
2025-03-27 20:53 ` [PATCH 20/41] x86/alternatives: Add text_mutex) assert to text_poke_int3_flush() Ingo Molnar
2025-03-27 20:53 ` [PATCH 21/41] x86/alternatives: Assert that text_poke_int3_handler() can only ever handle 'tp_vec[]' based requests Ingo Molnar
2025-03-27 20:53 ` [PATCH 22/41] x86/alternatives: Use non-inverted logic instead of 'tp_order_fail()' Ingo Molnar
2025-03-27 20:53 ` [PATCH 23/41] x86/alternatives: Remove the 'addr == NULL means forced-flush' hack from text_poke_int3_finish()/text_poke_int3_flush()/tp_addr_ordered() Ingo Molnar
2025-03-27 20:53 ` [PATCH 24/41] x86/alternatives: Simplify text_poke_int3() by using tp_vec and existing APIs Ingo Molnar
2025-03-27 20:53 ` [PATCH 25/41] x86/alternatives: Assert input parameters in text_poke_int3_batch() Ingo Molnar
2025-03-27 20:53 ` [PATCH 26/41] x86/alternatives: Introduce 'struct text_poke_int3_array' and move tp_vec and tp_vec_nr to it Ingo Molnar
2025-03-27 20:53 ` [PATCH 27/41] x86/alternatives: Remove the tp_vec indirection Ingo Molnar
2025-03-27 20:53 ` [PATCH 28/41] x86/alternatives: Rename 'try_get_desc()' to 'try_get_tp_array()' Ingo Molnar
2025-03-27 20:53 ` [PATCH 29/41] x86/alternatives: Rename 'put_desc()' to 'put_tp_array()' Ingo Molnar
2025-03-27 20:53 ` [PATCH 30/41] x86/alternatives: Simplify try_get_tp_array() Ingo Molnar
2025-03-27 20:53 ` [PATCH 31/41] x86/alternatives: Simplify text_poke_int3_handler() Ingo Molnar
2025-03-27 20:53 ` [PATCH 32/41] x86/alternatives: Simplify text_poke_int3_batch() Ingo Molnar
2025-03-27 20:53 ` [PATCH 33/41] x86/alternatives: Rename 'text_poke_int3_batch()' to 'text_poke_int3_batch_process()' Ingo Molnar
2025-03-27 20:53 ` [PATCH 34/41] x86/alternatives: Rename 'int3_refs' to 'tp_array_refs' Ingo Molnar
2025-03-27 20:53 ` [PATCH 35/41] x86/alternatives: Move the tp_array manipulation into text_poke_int3_loc_init() and rename it to text_poke_int3_loc_add() Ingo Molnar
2025-03-27 20:53 ` [PATCH 36/41] x86/alternatives: Remove the mixed-patching restriction on text_poke_int3() Ingo Molnar
2025-03-27 20:53 ` [PATCH 37/41] x86/alternatives: Rename 'text_poke_int3()' to 'text_poke_int3_now()' Ingo Molnar
2025-03-27 20:53 ` [PATCH 38/41] x86/alternatives: Add documentation for text_poke_int3_queue() Ingo Molnar
2025-03-27 20:53 ` [PATCH 39/41] x86/alternatives: Move tp_array completion from text_poke_int3_finish() and text_poke_int3_flush() to text_poke_int3_batch_process() Ingo Molnar
2025-03-27 20:53 ` [PATCH 40/41] x86/alternatives: Rename 'text_poke_sync()' to 'text_poke_sync_each_cpu()' Ingo Molnar
2025-04-02 4:10 ` H. Peter Anvin
2025-04-03 15:05 ` Ingo Molnar
2025-03-27 20:53 ` [PATCH 41/41] x86/alternatives: Simplify tp_addr_ordered() Ingo Molnar
2025-03-27 22:19 ` [PATCH 00/41] Simplify, reorganize and clean up the x86 INT3 based batch-patching code (alternative.c) Linus Torvalds
2025-03-28 10:10 ` Ingo Molnar
2025-04-01 14:55 ` Peter Zijlstra