* [PATCH 0/3] Fix and unify call thunks assembly snippets
@ 2023-11-02 11:25 Uros Bizjak
From: Uros Bizjak @ 2023-11-02 11:25 UTC
To: x86, linux-kernel
Cc: Uros Bizjak, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, H. Peter Anvin, Peter Zijlstra
Currently, INCREMENT_CALL_DEPTH and the thunk debug macros hardcode
the %gs: segment register prefix for their percpu variables.
This is not compatible with !CONFIG_SMP, which requires non-prefixed
percpu variables.
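As a rough illustration of the problem (a userspace sketch, not kernel
code), the snippet below models how a PER_CPU_VAR-style macro changes
the text of an assembly snippet depending on the configuration. The
_SMP/_UP macro names, the STR helper and has_gs_prefix() are made up
for this example; in the kernel a single PER_CPU_VAR from percpu.h is
selected at build time.

```c
#include <assert.h>
#include <string.h>

/*
 * Two-level stringification so that macro arguments are expanded before
 * being turned into a string, in the spirit of the kernel's __stringify().
 * STR/STR_1 are hypothetical names used only in this sketch.
 */
#define STR_1(...) #__VA_ARGS__
#define STR(...)   STR_1(__VA_ARGS__)

/* Hypothetical stand-ins for the two configurations of PER_CPU_VAR. */
#define PER_CPU_VAR_SMP(var) %gs:var	/* models CONFIG_SMP */
#define PER_CPU_VAR_UP(var)  var	/* models !CONFIG_SMP */

/* The same debug counter increment, written once per configuration. */
#define INC_CALLS_SMP incq PER_CPU_VAR_SMP(__x86_call_count);
#define INC_CALLS_UP  incq PER_CPU_VAR_UP(__x86_call_count);

static const char inc_calls_smp[] = STR(INC_CALLS_SMP);
static const char inc_calls_up[]  = STR(INC_CALLS_UP);

/* Returns nonzero if the assembly text carries a %gs: segment prefix. */
static int has_gs_prefix(const char *snippet)
{
	assert(snippet != NULL);
	return strstr(snippet, "%gs:") != NULL;
}
```

A snippet that spells out %gs: itself always looks like the first
string, whatever the configuration; going through the macro lets the
uniprocessor build drop the prefix.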
Unlike alternatives, call thunk templates currently do not support
relocations. Support for relocations will be needed when the
PER_CPU_VAR macro switches to %rip-relative addressing.
Because relocations are unsupported, two variants of the
INCREMENT_CALL_DEPTH macro are needed: an ASM_ prefixed version that
allows relocations and a non-prefixed version that allows only
absolute addresses.
The following patch series fixes the above issues by:
a) Moving the call thunk template to its own callthunks-tmpl.S assembly
file, where the PER_CPU_VAR macro from percpu.h can be used to
conditionally use the %gs: segment register prefix, depending on
CONFIG_SMP.
b) Implementing minimal support for relocations when copying the call
thunk template from its storage location, to handle %rip-relative
addresses.
c) Fixing the call thunks debug macros to use the PER_CPU_VAR macro from
percpu.h to conditionally use the %gs: segment register prefix,
depending on CONFIG_SMP.
d) Unifying the ASM_ prefixed assembly macros with their non-prefixed
variants. With support for %rip-relative relocations in place, call
thunk templates allow %rip-relative addressing, so a unified assembly
snippet can be used everywhere.
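Item b) can be pictured with a small userspace sketch. It is not the
kernel implementation (the series ends up reusing apply_relocation()
from alternative.c); the function names below are hypothetical and the
code assumes a little-endian host. The idea: a %rip-relative operand
addresses "end of instruction + displacement", so when the instruction
bytes are copied from src to dest, the displacement has to grow by
(src - dest) to keep pointing at the same target.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Address referenced by a %rip-relative operand: end of insn + disp. */
static uint64_t rip_target(uint64_t insn_addr, unsigned int insn_len,
			   int32_t disp)
{
	return insn_addr + insn_len + (uint64_t)(int64_t)disp;
}

/*
 * Adjust the 32-bit displacement stored at 'disp_ptr' for an instruction
 * that was assembled at address 'src' and will execute at address 'dest'.
 * Returns 0 on success, or -1 (leaving the bytes untouched) when the
 * adjusted displacement no longer fits in 32 bits.
 */
static int fixup_rip_disp32(uint8_t *disp_ptr, uint64_t src, uint64_t dest)
{
	int64_t delta = (int64_t)(src - dest);
	int64_t adjusted;
	int32_t disp;

	assert(disp_ptr != NULL);

	/* A move this far can never be compensated by a 32-bit field. */
	if (delta < -(int64_t)UINT32_MAX || delta > (int64_t)UINT32_MAX)
		return -1;

	memcpy(&disp, disp_ptr, sizeof(disp));	/* little-endian host assumed */
	adjusted = (int64_t)disp + delta;
	if (adjusted < INT32_MIN || adjusted > INT32_MAX)
		return -1;

	disp = (int32_t)adjusted;
	memcpy(disp_ptr, &disp, sizeof(disp));
	return 0;
}
```

For example, "incq 0x100(%rip)" encodes as 48 ff 05 followed by the
four displacement bytes. Assembled at 0x1000 it refers to
0x1000 + 7 + 0x100 = 0x1107. After copying it to 0x4000, calling
fixup_rip_disp32(insn + 3, 0x1000, 0x4000) rewrites the displacement to
-0x2f00, and 0x4000 + 7 - 0x2f00 is again 0x1107.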
The patch series is independent of the main percpu series in the -tip tree.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Uros Bizjak (3):
x86/callthunks: Move call thunk template to .S file
x86/callthunks: Handle %rip-relative relocations in call thunk
template
x86/callthunks: Fix and unify call thunks assembly snippets
arch/x86/include/asm/nospec-branch.h | 23 +++------
arch/x86/kernel/Makefile | 2 +-
arch/x86/kernel/callthunks-tmpl.S | 11 +++++
arch/x86/kernel/callthunks.c | 73 +++++++++++++++++++++-------
4 files changed, 75 insertions(+), 34 deletions(-)
create mode 100644 arch/x86/kernel/callthunks-tmpl.S
--
2.41.0
* [PATCH 1/3] x86/callthunks: Move call thunk template to .S file
From: Uros Bizjak @ 2023-11-02 11:25 UTC
To: x86, linux-kernel
Cc: Uros Bizjak, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, H. Peter Anvin, Peter Zijlstra
Currently INCREMENT_CALL_DEPTH explicitly defines %gs: segment
register prefix for its percpu variable. This is not compatible
with !CONFIG_SMP, which requires non-prefixed percpu variables.
Move call thunk template to its own callthunks-tmpl.S assembly file
where PER_CPU_VAR macro from percpu.h can be used to conditionally
use %gs: segment register prefix, depending on CONFIG_SMP.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
---
arch/x86/kernel/Makefile | 2 +-
arch/x86/kernel/callthunks-tmpl.S | 11 +++++++++++
arch/x86/kernel/callthunks.c | 10 ----------
3 files changed, 12 insertions(+), 11 deletions(-)
create mode 100644 arch/x86/kernel/callthunks-tmpl.S
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 3269a0e23d3a..6b6b68ef4c3b 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -143,7 +143,7 @@ obj-$(CONFIG_AMD_MEM_ENCRYPT) += sev.o
 obj-$(CONFIG_CFI_CLANG) += cfi.o
-obj-$(CONFIG_CALL_THUNKS) += callthunks.o
+obj-$(CONFIG_CALL_THUNKS) += callthunks.o callthunks-tmpl.o
 obj-$(CONFIG_X86_CET) += cet.o
diff --git a/arch/x86/kernel/callthunks-tmpl.S b/arch/x86/kernel/callthunks-tmpl.S
new file mode 100644
index 000000000000..e82c473bd1b1
--- /dev/null
+++ b/arch/x86/kernel/callthunks-tmpl.S
@@ -0,0 +1,11 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <asm/nospec-branch.h>
+
+	.section .rodata
+	.global skl_call_thunk_template
+	.global skl_call_thunk_tail
+
+skl_call_thunk_template:
+	INCREMENT_CALL_DEPTH
+skl_call_thunk_tail:
diff --git a/arch/x86/kernel/callthunks.c b/arch/x86/kernel/callthunks.c
index e9ad518a5003..d0922cf94c90 100644
--- a/arch/x86/kernel/callthunks.c
+++ b/arch/x86/kernel/callthunks.c
@@ -62,16 +62,6 @@ static const struct core_text builtin_coretext = {
 	.name = "builtin",
 };
-asm (
-	".pushsection .rodata \n"
-	".global skl_call_thunk_template \n"
-	"skl_call_thunk_template: \n"
-	__stringify(INCREMENT_CALL_DEPTH)" \n"
-	".global skl_call_thunk_tail \n"
-	"skl_call_thunk_tail: \n"
-	".popsection \n"
-);
-
 extern u8 skl_call_thunk_template[];
 extern u8 skl_call_thunk_tail[];
--
2.41.0
* [PATCH 2/3] x86/callthunks: Handle %rip-relative relocations in call thunk template
From: Uros Bizjak @ 2023-11-02 11:25 UTC
To: x86, linux-kernel
Cc: Uros Bizjak, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, H. Peter Anvin, Peter Zijlstra
Contrary to alternatives, relocations are currently not supported in
call thunk templates. Implement minimal support for relocations when
copying template from its storage location to handle %rip-relative
addresses.
Support for relocations will be needed when PER_CPU_VAR macro switches
to %rip-relative addressing.
The patch allows unification of ASM_INCREMENT_CALL_DEPTH, which already
uses PER_CPU_VAR macro, with INCREMENT_CALL_DEPTH, used in call thunk
template, which is currently limited to use absolute address.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
---
arch/x86/kernel/callthunks.c | 63 ++++++++++++++++++++++++++++++++----
1 file changed, 56 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kernel/callthunks.c b/arch/x86/kernel/callthunks.c
index d0922cf94c90..bda09d82bff7 100644
--- a/arch/x86/kernel/callthunks.c
+++ b/arch/x86/kernel/callthunks.c
@@ -24,6 +24,8 @@
 static int __initdata_or_module debug_callthunks;
+#define MAX_PATCH_LEN (255-1)
+
 #define prdbg(fmt, args...) \
 do { \
 	if (debug_callthunks) \
@@ -166,13 +168,51 @@ static const u8 nops[] = {
 	0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90,
 };
+#define apply_reloc_n(n_, p_, d_) \
+	do { \
+		s32 v = *(s##n_ *)(p_); \
+		v += (d_); \
+		BUG_ON((v >> 31) != (v >> (n_-1))); \
+		*(s##n_ *)(p_) = (s##n_)v; \
+	} while (0)
+
+static __always_inline
+void apply_reloc(int n, void *ptr, uintptr_t diff)
+{
+	switch (n) {
+	case 4: apply_reloc_n(32, ptr, diff); break;
+	default: BUG();
+	}
+}
+
+static void apply_relocation(u8 *buf, size_t len, u8 *dest, u8 *src)
+{
+	for (int next, i = 0; i < len; i = next) {
+		struct insn insn;
+
+		if (WARN_ON_ONCE(insn_decode_kernel(&insn, &buf[i])))
+			return;
+
+		next = i + insn.length;
+
+		if (insn_rip_relative(&insn))
+			apply_reloc(insn.displacement.nbytes,
+				    buf + i + insn_offset_displacement(&insn),
+				    src - dest);
+	}
+}
+
 static void *patch_dest(void *dest, bool direct)
 {
 	unsigned int tsize = SKL_TMPL_SIZE;
+	u8 insn_buff[MAX_PATCH_LEN];
 	u8 *pad = dest - tsize;
+	memcpy(insn_buff, skl_call_thunk_template, tsize);
+	apply_relocation(insn_buff, tsize, pad, skl_call_thunk_template);
+
 	/* Already patched? */
-	if (!bcmp(pad, skl_call_thunk_template, tsize))
+	if (!bcmp(pad, insn_buff, tsize))
 		return pad;
 	/* Ensure there are nops */
@@ -182,9 +222,9 @@ static void *patch_dest(void *dest, bool direct)
 	}
 	if (direct)
-		memcpy(pad, skl_call_thunk_template, tsize);
+		memcpy(pad, insn_buff, tsize);
 	else
-		text_poke_copy_locked(pad, skl_call_thunk_template, tsize, true);
+		text_poke_copy_locked(pad, insn_buff, tsize, true);
 	return pad;
 }
@@ -281,20 +321,26 @@ void *callthunks_translate_call_dest(void *dest)
 static bool is_callthunk(void *addr)
 {
 	unsigned int tmpl_size = SKL_TMPL_SIZE;
-	void *tmpl = skl_call_thunk_template;
+	u8 insn_buff[MAX_PATCH_LEN];
 	unsigned long dest;
+	u8 *pad;
 	dest = roundup((unsigned long)addr, CONFIG_FUNCTION_ALIGNMENT);
 	if (!thunks_initialized || skip_addr((void *)dest))
 		return false;
-	return !bcmp((void *)(dest - tmpl_size), tmpl, tmpl_size);
+	pad = (void *)(dest - tmpl_size);
+
+	memcpy(insn_buff, skl_call_thunk_template, tmpl_size);
+	apply_relocation(insn_buff, tmpl_size, pad, skl_call_thunk_template);
+
+	return !bcmp(pad, insn_buff, tmpl_size);
 }
 int x86_call_depth_emit_accounting(u8 **pprog, void *func)
 {
 	unsigned int tmpl_size = SKL_TMPL_SIZE;
-	void *tmpl = skl_call_thunk_template;
+	u8 insn_buff[MAX_PATCH_LEN];
 	if (!thunks_initialized)
 		return 0;
@@ -303,7 +349,10 @@ int x86_call_depth_emit_accounting(u8 **pprog, void *func)
 	if (func && is_callthunk(func))
 		return 0;
-	memcpy(*pprog, tmpl, tmpl_size);
+	memcpy(insn_buff, skl_call_thunk_template, tmpl_size);
+	apply_relocation(insn_buff, tmpl_size, *pprog, skl_call_thunk_template);
+
+	memcpy(*pprog, insn_buff, tmpl_size);
 	*pprog += tmpl_size;
 	return tmpl_size;
 }
--
2.41.0
* Re: [PATCH 2/3] x86/callthunks: Handle %rip-relative relocations in call thunk template
From: Peter Zijlstra @ 2023-11-02 11:44 UTC
To: Uros Bizjak
Cc: x86, linux-kernel, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, H. Peter Anvin
On Thu, Nov 02, 2023 at 12:25:47PM +0100, Uros Bizjak wrote:
> @@ -166,13 +168,51 @@ static const u8 nops[] = {
> 	0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90,
> };
>
> +#define apply_reloc_n(n_, p_, d_) \
> +	do { \
> +		s32 v = *(s##n_ *)(p_); \
> +		v += (d_); \
> +		BUG_ON((v >> 31) != (v >> (n_-1))); \
> +		*(s##n_ *)(p_) = (s##n_)v; \
> +	} while (0)
> +
> +static __always_inline
> +void apply_reloc(int n, void *ptr, uintptr_t diff)
> +{
> +	switch (n) {
> +	case 4: apply_reloc_n(32, ptr, diff); break;
> +	default: BUG();
> +	}
> +}
> +
> +static void apply_relocation(u8 *buf, size_t len, u8 *dest, u8 *src)
> +{
> +	for (int next, i = 0; i < len; i = next) {
> +		struct insn insn;
> +
> +		if (WARN_ON_ONCE(insn_decode_kernel(&insn, &buf[i])))
> +			return;
> +
> +		next = i + insn.length;
> +
> +		if (insn_rip_relative(&insn))
> +			apply_reloc(insn.displacement.nbytes,
> +				    buf + i + insn_offset_displacement(&insn),
> +				    src - dest);
> +	}
> +}
Isn't it simpler to use apply_relocation() from alternative.c?
Remove static, add decl, stuff like that?
* Re: [PATCH 2/3] x86/callthunks: Handle %rip-relative relocations in call thunk template
From: Uros Bizjak @ 2023-11-02 11:50 UTC
To: Peter Zijlstra
Cc: x86, linux-kernel, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, H. Peter Anvin
On Thu, Nov 2, 2023 at 12:44 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Thu, Nov 02, 2023 at 12:25:47PM +0100, Uros Bizjak wrote:
>
> > @@ -166,13 +168,51 @@ static const u8 nops[] = {
> > 	0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90,
> > };
> >
> > +#define apply_reloc_n(n_, p_, d_) \
> > +	do { \
> > +		s32 v = *(s##n_ *)(p_); \
> > +		v += (d_); \
> > +		BUG_ON((v >> 31) != (v >> (n_-1))); \
> > +		*(s##n_ *)(p_) = (s##n_)v; \
> > +	} while (0)
> > +
> > +static __always_inline
> > +void apply_reloc(int n, void *ptr, uintptr_t diff)
> > +{
> > +	switch (n) {
> > +	case 4: apply_reloc_n(32, ptr, diff); break;
> > +	default: BUG();
> > +	}
> > +}
> > +
> > +static void apply_relocation(u8 *buf, size_t len, u8 *dest, u8 *src)
> > +{
> > +	for (int next, i = 0; i < len; i = next) {
> > +		struct insn insn;
> > +
> > +		if (WARN_ON_ONCE(insn_decode_kernel(&insn, &buf[i])))
> > +			return;
> > +
> > +		next = i + insn.length;
> > +
> > +		if (insn_rip_relative(&insn))
> > +			apply_reloc(insn.displacement.nbytes,
> > +				    buf + i + insn_offset_displacement(&insn),
> > +				    src - dest);
> > +	}
> > +}
>
> Isn't it simpler to use apply_relocation() from alternative.c?
Yes, I was looking at that function, but somehow thought that it is a
bit overkill here, since we just need a %rip-relative reloc.
> Remove static, add decl, stuff like that?
On second thought, you are right. Should I move the above function
somewhere (reloc.c?) , or can I just use it from alternative.c and add
decl (where?) ?
Thanks,
Uros.
* Re: [PATCH 2/3] x86/callthunks: Handle %rip-relative relocations in call thunk template
From: Peter Zijlstra @ 2023-11-02 11:56 UTC
To: Uros Bizjak
Cc: x86, linux-kernel, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, H. Peter Anvin
On Thu, Nov 02, 2023 at 12:50:01PM +0100, Uros Bizjak wrote:
> > Remove static, add decl, stuff like that?
>
> On second thought, you are right. Should I move the above function
> somewhere (reloc.c?) , or can I just use it from alternative.c and add
> decl (where?) ?
Yeah, leave it there for now, perhaps asm/text-patching.h ?
* Re: [PATCH 2/3] x86/callthunks: Handle %rip-relative relocations in call thunk template
From: Uros Bizjak @ 2023-11-02 12:34 UTC
To: Peter Zijlstra
Cc: x86, linux-kernel, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, H. Peter Anvin
[-- Attachment #1: Type: text/plain, Size: 511 bytes --]
On Thu, Nov 2, 2023 at 12:56 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Thu, Nov 02, 2023 at 12:50:01PM +0100, Uros Bizjak wrote:
>
> > > Remove static, add decl, stuff like that?
> >
> > On second thought, you are right. Should I move the above function
> > somewhere (reloc.c?) , or can I just use it from alternative.c and add
> > decl (where?) ?
>
> Yeah, leave it there for now, perhaps asm/text-patching.h ?
The new version boots OK and looks like the attached patch.
Uros.
[-- Attachment #2: reloc.diff.txt --]
[-- Type: text/plain, Size: 3581 bytes --]
diff --git a/arch/x86/include/asm/text-patching.h b/arch/x86/include/asm/text-patching.h
index 29832c338cdc..ba8d900f3ebe 100644
--- a/arch/x86/include/asm/text-patching.h
+++ b/arch/x86/include/asm/text-patching.h
@@ -18,6 +18,8 @@ static inline void apply_paravirt(struct paravirt_patch_site *start,
 #define __parainstructions_end NULL
 #endif
+void apply_relocation(u8 *buf, size_t len, u8 *dest, u8 *src, size_t src_len);
+
 /*
  * Currently, the max observed size in the kernel code is
  * JUMP_LABEL_NOP_SIZE/RELATIVEJUMP_SIZE, which are 5.
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 73be3931e4f0..66140c54d4f6 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -325,8 +325,7 @@ bool need_reloc(unsigned long offset, u8 *src, size_t src_len)
 	return (target < src || target > src + src_len);
 }
-static void __init_or_module noinline
-apply_relocation(u8 *buf, size_t len, u8 *dest, u8 *src, size_t src_len)
+void apply_relocation(u8 *buf, size_t len, u8 *dest, u8 *src, size_t src_len)
 {
 	int prev, target = 0;
diff --git a/arch/x86/kernel/callthunks.c b/arch/x86/kernel/callthunks.c
index d0922cf94c90..832eaec36e2b 100644
--- a/arch/x86/kernel/callthunks.c
+++ b/arch/x86/kernel/callthunks.c
@@ -24,6 +24,8 @@
 static int __initdata_or_module debug_callthunks;
+#define MAX_PATCH_LEN (255-1)
+
 #define prdbg(fmt, args...) \
 do { \
 	if (debug_callthunks) \
@@ -169,10 +171,15 @@ static const u8 nops[] = {
 static void *patch_dest(void *dest, bool direct)
 {
 	unsigned int tsize = SKL_TMPL_SIZE;
+	u8 insn_buff[MAX_PATCH_LEN];
 	u8 *pad = dest - tsize;
+	memcpy(insn_buff, skl_call_thunk_template, tsize);
+	apply_relocation(insn_buff, tsize, pad,
+			 skl_call_thunk_template, tsize);
+
 	/* Already patched? */
-	if (!bcmp(pad, skl_call_thunk_template, tsize))
+	if (!bcmp(pad, insn_buff, tsize))
 		return pad;
 	/* Ensure there are nops */
@@ -182,9 +189,9 @@ static void *patch_dest(void *dest, bool direct)
 	}
 	if (direct)
-		memcpy(pad, skl_call_thunk_template, tsize);
+		memcpy(pad, insn_buff, tsize);
 	else
-		text_poke_copy_locked(pad, skl_call_thunk_template, tsize, true);
+		text_poke_copy_locked(pad, insn_buff, tsize, true);
 	return pad;
 }
@@ -281,20 +288,27 @@ void *callthunks_translate_call_dest(void *dest)
 static bool is_callthunk(void *addr)
 {
 	unsigned int tmpl_size = SKL_TMPL_SIZE;
-	void *tmpl = skl_call_thunk_template;
+	u8 insn_buff[MAX_PATCH_LEN];
 	unsigned long dest;
+	u8 *pad;
 	dest = roundup((unsigned long)addr, CONFIG_FUNCTION_ALIGNMENT);
 	if (!thunks_initialized || skip_addr((void *)dest))
 		return false;
-	return !bcmp((void *)(dest - tmpl_size), tmpl, tmpl_size);
+	pad = (void *)(dest - tmpl_size);
+
+	memcpy(insn_buff, skl_call_thunk_template, tmpl_size);
+	apply_relocation(insn_buff, tmpl_size, pad,
+			 skl_call_thunk_template, tmpl_size);
+
+	return !bcmp(pad, insn_buff, tmpl_size);
 }
 int x86_call_depth_emit_accounting(u8 **pprog, void *func)
 {
 	unsigned int tmpl_size = SKL_TMPL_SIZE;
-	void *tmpl = skl_call_thunk_template;
+	u8 insn_buff[MAX_PATCH_LEN];
 	if (!thunks_initialized)
 		return 0;
@@ -303,7 +317,11 @@ int x86_call_depth_emit_accounting(u8 **pprog, void *func)
 	if (func && is_callthunk(func))
 		return 0;
-	memcpy(*pprog, tmpl, tmpl_size);
+	memcpy(insn_buff, skl_call_thunk_template, tmpl_size);
+	apply_relocation(insn_buff, tmpl_size, *pprog,
+			 skl_call_thunk_template, tmpl_size);
+
+	memcpy(*pprog, insn_buff, tmpl_size);
 	*pprog += tmpl_size;
 	return tmpl_size;
 }
* [PATCH 3/3] x86/callthunks: Fix and unify call thunks assembly snippets
From: Uros Bizjak @ 2023-11-02 11:25 UTC
To: x86, linux-kernel
Cc: Uros Bizjak, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, H. Peter Anvin, Peter Zijlstra
Currently thunk debug macros explicitly define %gs: segment register
prefix for their percpu variables. This is not compatible with
!CONFIG_SMP, which requires non-prefixed percpu variables. Fix call
thunks debug macros to use PER_CPU_VAR macro from percpu.h to
conditionally use %gs: segment register prefix, depending on CONFIG_SMP.
Finally, unify ASM_ prefixed assembly macros with their non-prefixed
variants. With support of %rip-relative relocations in place, call
thunk templates allow %rip-relative addressing, so unified assembly
snippet can be used everywhere.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
---
arch/x86/include/asm/nospec-branch.h | 23 +++++++----------------
1 file changed, 7 insertions(+), 16 deletions(-)
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index f93e9b96927a..6f677be6bdb9 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -59,13 +59,13 @@
 #ifdef CONFIG_CALL_THUNKS_DEBUG
 # define CALL_THUNKS_DEBUG_INC_CALLS \
-	incq %gs:__x86_call_count;
+	incq PER_CPU_VAR(__x86_call_count);
 # define CALL_THUNKS_DEBUG_INC_RETS \
-	incq %gs:__x86_ret_count;
+	incq PER_CPU_VAR(__x86_ret_count);
 # define CALL_THUNKS_DEBUG_INC_STUFFS \
-	incq %gs:__x86_stuffs_count;
+	incq PER_CPU_VAR(__x86_stuffs_count);
 # define CALL_THUNKS_DEBUG_INC_CTXSW \
-	incq %gs:__x86_ctxsw_count;
+	incq PER_CPU_VAR(__x86_ctxsw_count);
 #else
 # define CALL_THUNKS_DEBUG_INC_CALLS
 # define CALL_THUNKS_DEBUG_INC_RETS
@@ -80,9 +80,6 @@
 #define CREDIT_CALL_DEPTH \
 	movq $-1, PER_CPU_VAR(pcpu_hot + X86_call_depth);
-#define ASM_CREDIT_CALL_DEPTH \
-	movq $-1, PER_CPU_VAR(pcpu_hot + X86_call_depth);
-
 #define RESET_CALL_DEPTH \
 	xor %eax, %eax; \
 	bts $63, %rax; \
@@ -95,20 +92,14 @@
 	CALL_THUNKS_DEBUG_INC_CALLS
 #define INCREMENT_CALL_DEPTH \
-	sarq $5, %gs:pcpu_hot + X86_call_depth; \
-	CALL_THUNKS_DEBUG_INC_CALLS
-
-#define ASM_INCREMENT_CALL_DEPTH \
 	sarq $5, PER_CPU_VAR(pcpu_hot + X86_call_depth); \
 	CALL_THUNKS_DEBUG_INC_CALLS
 #else
 #define CREDIT_CALL_DEPTH
-#define ASM_CREDIT_CALL_DEPTH
 #define RESET_CALL_DEPTH
-#define INCREMENT_CALL_DEPTH
-#define ASM_INCREMENT_CALL_DEPTH
 #define RESET_CALL_DEPTH_FROM_CALL
+#define INCREMENT_CALL_DEPTH
 #endif
 /*
@@ -158,7 +149,7 @@
 	jnz 771b; \
 	/* barrier for jnz misprediction */ \
 	lfence; \
-	ASM_CREDIT_CALL_DEPTH \
+	CREDIT_CALL_DEPTH \
 	CALL_THUNKS_DEBUG_INC_CTXSW
 #else
 /*
@@ -311,7 +302,7 @@
 .macro CALL_DEPTH_ACCOUNT
 #ifdef CONFIG_CALL_DEPTH_TRACKING
 	ALTERNATIVE "", \
-		    __stringify(ASM_INCREMENT_CALL_DEPTH), X86_FEATURE_CALL_DEPTH
+		    __stringify(INCREMENT_CALL_DEPTH), X86_FEATURE_CALL_DEPTH
 #endif
 .endm
--
2.41.0
* Re: [PATCH 3/3] x86/callthunks: Fix and unify call thunks assembly snippets
From: kernel test robot @ 2023-11-05 8:23 UTC
To: Uros Bizjak, x86, linux-kernel
Cc: oe-kbuild-all, Uros Bizjak, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, H. Peter Anvin, Peter Zijlstra
Hi Uros,
kernel test robot noticed the following build errors:
[auto build test ERROR on tip/x86/core]
[also build test ERROR on tip/master tip/auto-latest linus/master v6.6 next-20231103]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Uros-Bizjak/x86-callthunks-Move-call-thunk-template-to-S-file/20231102-193542
base: tip/x86/core
patch link: https://lore.kernel.org/r/20231102112850.3448745-4-ubizjak%40gmail.com
patch subject: [PATCH 3/3] x86/callthunks: Fix and unify call thunks assembly snippets
config: x86_64-allyesconfig (https://download.01.org/0day-ci/archive/20231105/202311051652.38OyamEq-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231105/202311051652.38OyamEq-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202311051652.38OyamEq-lkp@intel.com/
All errors (new ones prefixed by >>):
/tmp/ccwZh9MG.s: Assembler messages:
>> /tmp/ccwZh9MG.s:27: Error: junk `(pcpu_hot+16)' after expression
>> /tmp/ccwZh9MG.s:27: Error: junk `(__x86_call_count)' after expression
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki