* [PATCH v2 1/4] x86/percpu: Introduce PER_CPU_ARG and use it in cmpxchg{8,16}b_emu.S
2023-10-12 20:12 [PATCH v2 0/4] Introduce %rip-relative addressing to PER_CPU_VAR macro Uros Bizjak
@ 2023-10-12 20:12 ` Uros Bizjak
2023-10-12 20:12 ` [PATCH v2 2/4] x86/percpu: Correct PER_CPU_VAR usage to include symbol and its addend Uros Bizjak
` (3 subsequent siblings)
4 siblings, 0 replies; 11+ messages in thread
From: Uros Bizjak @ 2023-10-12 20:12 UTC (permalink / raw)
To: x86, xen-devel, linux-kernel
Cc: Uros Bizjak, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, H. Peter Anvin, Peter Zijlstra
The PER_CPU_VAR macro is intended to be applied to a symbol and should
not be used with general operands. Introduce a new PER_CPU_ARG macro and
use it in cmpxchg{8,16}b_emu.S instead.
The PER_CPU_VAR macro will be repurposed for %rip-relative addressing.
Also add a missing function comment to this_cpu_cmpxchg8b_emu.
No functional changes intended.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
--
v2: Introduce PER_CPU_ARG macro to conditionally enable
segment registers in cmpxchg{8,16}b_emu.S for CONFIG_SMP.
---
arch/x86/include/asm/percpu.h | 2 ++
arch/x86/lib/cmpxchg16b_emu.S | 12 ++++++------
arch/x86/lib/cmpxchg8b_emu.S | 24 ++++++++++++++++++------
3 files changed, 26 insertions(+), 12 deletions(-)
diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
index 34734d730463..83e6a4bcea38 100644
--- a/arch/x86/include/asm/percpu.h
+++ b/arch/x86/include/asm/percpu.h
@@ -11,8 +11,10 @@
#ifdef __ASSEMBLY__
#ifdef CONFIG_SMP
+#define PER_CPU_ARG(arg) %__percpu_seg:arg
#define PER_CPU_VAR(var) %__percpu_seg:var
#else /* ! SMP */
+#define PER_CPU_ARG(arg) arg
#define PER_CPU_VAR(var) var
#endif /* SMP */
diff --git a/arch/x86/lib/cmpxchg16b_emu.S b/arch/x86/lib/cmpxchg16b_emu.S
index 6962df315793..b6b942d07a00 100644
--- a/arch/x86/lib/cmpxchg16b_emu.S
+++ b/arch/x86/lib/cmpxchg16b_emu.S
@@ -23,14 +23,14 @@ SYM_FUNC_START(this_cpu_cmpxchg16b_emu)
cli
/* if (*ptr == old) */
- cmpq PER_CPU_VAR(0(%rsi)), %rax
+ cmpq PER_CPU_ARG(0(%rsi)), %rax
jne .Lnot_same
- cmpq PER_CPU_VAR(8(%rsi)), %rdx
+ cmpq PER_CPU_ARG(8(%rsi)), %rdx
jne .Lnot_same
/* *ptr = new */
- movq %rbx, PER_CPU_VAR(0(%rsi))
- movq %rcx, PER_CPU_VAR(8(%rsi))
+ movq %rbx, PER_CPU_ARG(0(%rsi))
+ movq %rcx, PER_CPU_ARG(8(%rsi))
/* set ZF in EFLAGS to indicate success */
orl $X86_EFLAGS_ZF, (%rsp)
@@ -42,8 +42,8 @@ SYM_FUNC_START(this_cpu_cmpxchg16b_emu)
/* *ptr != old */
/* old = *ptr */
- movq PER_CPU_VAR(0(%rsi)), %rax
- movq PER_CPU_VAR(8(%rsi)), %rdx
+ movq PER_CPU_ARG(0(%rsi)), %rax
+ movq PER_CPU_ARG(8(%rsi)), %rdx
/* clear ZF in EFLAGS to indicate failure */
andl $(~X86_EFLAGS_ZF), (%rsp)
diff --git a/arch/x86/lib/cmpxchg8b_emu.S b/arch/x86/lib/cmpxchg8b_emu.S
index 49805257b125..9a0a7feeaf7c 100644
--- a/arch/x86/lib/cmpxchg8b_emu.S
+++ b/arch/x86/lib/cmpxchg8b_emu.S
@@ -53,18 +53,30 @@ EXPORT_SYMBOL(cmpxchg8b_emu)
#ifndef CONFIG_UML
+/*
+ * Emulate 'cmpxchg8b %fs:(%esi)'
+ *
+ * Inputs:
+ * %esi : memory location to compare
+ * %eax : low 32 bits of old value
+ * %edx : high 32 bits of old value
+ * %ebx : low 32 bits of new value
+ * %ecx : high 32 bits of new value
+ *
+ * Notably this is not LOCK prefixed and is not safe against NMIs
+ */
SYM_FUNC_START(this_cpu_cmpxchg8b_emu)
pushfl
cli
- cmpl PER_CPU_VAR(0(%esi)), %eax
+ cmpl PER_CPU_ARG(0(%esi)), %eax
jne .Lnot_same2
- cmpl PER_CPU_VAR(4(%esi)), %edx
+ cmpl PER_CPU_ARG(4(%esi)), %edx
jne .Lnot_same2
- movl %ebx, PER_CPU_VAR(0(%esi))
- movl %ecx, PER_CPU_VAR(4(%esi))
+ movl %ebx, PER_CPU_ARG(0(%esi))
+ movl %ecx, PER_CPU_ARG(4(%esi))
orl $X86_EFLAGS_ZF, (%esp)
@@ -72,8 +84,8 @@ SYM_FUNC_START(this_cpu_cmpxchg8b_emu)
RET
.Lnot_same2:
- movl PER_CPU_VAR(0(%esi)), %eax
- movl PER_CPU_VAR(4(%esi)), %edx
+ movl PER_CPU_ARG(0(%esi)), %eax
+ movl PER_CPU_ARG(4(%esi)), %edx
andl $(~X86_EFLAGS_ZF), (%esp)
--
2.41.0
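The function comment added to this_cpu_cmpxchg8b_emu in the patch above describes a compare-and-exchange on a 64-bit value held in two 32-bit halves. As a rough illustration only, the same logic can be sketched in C. The names `regs32` and `cmpxchg8b_model` are hypothetical and are not kernel code; the sketch does not model the pushfl/cli sequence or the segment prefix that PER_CPU_ARG supplies on SMP.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative C model of the compare-and-exchange performed by
 * this_cpu_cmpxchg8b_emu. Hypothetical names; the real routine is the
 * assembly in the patch. Interrupt masking and the %fs prefix are
 * deliberately left out.
 */
struct regs32 {
	uint32_t eax, edx;	/* old value: low, high */
	uint32_t ebx, ecx;	/* new value: low, high */
};

/* Returns the resulting ZF: true on success, false on mismatch. */
bool cmpxchg8b_model(uint32_t mem[2], struct regs32 *r)
{
	if (mem[0] == r->eax && mem[1] == r->edx) {
		/* *ptr == old: store the new value, report success */
		mem[0] = r->ebx;
		mem[1] = r->ecx;
		return true;
	}
	/* *ptr != old: hand the current value back, report failure */
	r->eax = mem[0];
	r->edx = mem[1];
	return false;
}
```

On a mismatch the caller sees the current memory contents in eax/edx, which mirrors the .Lnot_same2 path of the assembly.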
* [PATCH v2 2/4] x86/percpu: Correct PER_CPU_VAR usage to include symbol and its addend
2023-10-12 20:12 [PATCH v2 0/4] Introduce %rip-relative addressing to PER_CPU_VAR macro Uros Bizjak
2023-10-12 20:12 ` [PATCH v2 1/4] x86/percpu: Introduce PER_CPU_ARG and use it in cmpxchg{8,16}b_emu.S Uros Bizjak
@ 2023-10-12 20:12 ` Uros Bizjak
2023-10-12 20:12 ` [PATCH v2 3/4] x86/percpu, xen: " Uros Bizjak
` (2 subsequent siblings)
4 siblings, 0 replies; 11+ messages in thread
From: Uros Bizjak @ 2023-10-12 20:12 UTC (permalink / raw)
To: x86, xen-devel, linux-kernel
Cc: Uros Bizjak, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, H. Peter Anvin, Peter Zijlstra
PER_CPU_VAR macro should be applied to a symbol and its addend.
Inconsistent usage is currently harmless, but needs to be corrected
before %rip-relative addressing is introduced to PER_CPU_VAR macro.
No functional changes intended.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
---
arch/x86/entry/calling.h | 2 +-
arch/x86/entry/entry_32.S | 2 +-
arch/x86/entry/entry_64.S | 2 +-
arch/x86/kernel/head_64.S | 2 +-
4 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
index f6907627172b..47368ab0bda0 100644
--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -173,7 +173,7 @@ For 32-bit we have the following conventions - kernel is built with
.endm
#define THIS_CPU_user_pcid_flush_mask \
- PER_CPU_VAR(cpu_tlbstate) + TLB_STATE_user_pcid_flush_mask
+ PER_CPU_VAR(cpu_tlbstate + TLB_STATE_user_pcid_flush_mask)
.macro SWITCH_TO_USER_CR3_NOSTACK scratch_reg:req scratch_reg2:req
ALTERNATIVE "jmp .Lend_\@", "", X86_FEATURE_PTI
diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index 6e6af42e044a..d4e094b2c877 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -305,7 +305,7 @@
.macro CHECK_AND_APPLY_ESPFIX
#ifdef CONFIG_X86_ESPFIX32
#define GDT_ESPFIX_OFFSET (GDT_ENTRY_ESPFIX_SS * 8)
-#define GDT_ESPFIX_SS PER_CPU_VAR(gdt_page) + GDT_ESPFIX_OFFSET
+#define GDT_ESPFIX_SS PER_CPU_VAR(gdt_page + GDT_ESPFIX_OFFSET)
ALTERNATIVE "jmp .Lend_\@", "", X86_BUG_ESPFIX
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 43606de22511..3d6770b87b87 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -252,7 +252,7 @@ SYM_FUNC_START(__switch_to_asm)
#ifdef CONFIG_STACKPROTECTOR
movq TASK_stack_canary(%rsi), %rbx
- movq %rbx, PER_CPU_VAR(fixed_percpu_data) + FIXED_stack_canary
+ movq %rbx, PER_CPU_VAR(fixed_percpu_data + FIXED_stack_canary)
#endif
/*
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index ea6995920b7a..bfe5ec2f4f83 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -449,7 +449,7 @@ SYM_CODE_START(soft_restart_cpu)
UNWIND_HINT_END_OF_STACK
/* Find the idle task stack */
- movq PER_CPU_VAR(pcpu_hot) + X86_current_task, %rcx
+ movq PER_CPU_VAR(pcpu_hot + X86_current_task), %rcx
movq TASK_threadsp(%rcx), %rsp
jmp .Ljump_to_C_code
--
2.41.0
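To see why the addend has to move inside the macro argument, it may help to model the expansion with C string literals. This is only a sketch: the real PER_CPU_VAR works on assembler tokens, the `*_MODEL` names are hypothetical, and the suffix shown is the x86_64 SMP form that patch 4/4 of this series introduces.

```c
/*
 * String-literal model of PER_CPU_VAR once it appends (%rip).
 * Hypothetical names, for illustration only.
 */
#define PERCPU_SEG_MODEL	"%gs:"
#define PERCPU_REL_MODEL	"(%rip)"
#define PER_CPU_VAR_MODEL(var)	PERCPU_SEG_MODEL "(" var ")" PERCPU_REL_MODEL

/* Old usage: the addend is written after the macro invocation. */
const char *addend_outside(void)
{
	return PER_CPU_VAR_MODEL("pcpu_hot") " + X86_current_task";
}

/* Corrected usage: symbol and addend are both passed to the macro. */
const char *addend_inside(void)
{
	return PER_CPU_VAR_MODEL("pcpu_hot + X86_current_task");
}
```

In this model the old usage yields "%gs:(pcpu_hot)(%rip) + X86_current_task", with the addend stranded after the suffix, while the corrected usage yields "%gs:(pcpu_hot + X86_current_task)(%rip)", where the addend is part of the displacement.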
* [PATCH v2 3/4] x86/percpu, xen: Correct PER_CPU_VAR usage to include symbol and its addend
2023-10-12 20:12 [PATCH v2 0/4] Introduce %rip-relative addressing to PER_CPU_VAR macro Uros Bizjak
2023-10-12 20:12 ` [PATCH v2 1/4] x86/percpu: Introduce PER_CPU_ARG and use it in cmpxchg{8,16}b_emu.S Uros Bizjak
2023-10-12 20:12 ` [PATCH v2 2/4] x86/percpu: Correct PER_CPU_VAR usage to include symbol and its addend Uros Bizjak
@ 2023-10-12 20:12 ` Uros Bizjak
2023-10-12 20:12 ` [PATCH v2 4/4] x86/percpu: Introduce %rip-relative addressing to PER_CPU_VAR macro Uros Bizjak
2023-10-12 20:53 ` [PATCH v2 0/4] " Dave Hansen
4 siblings, 0 replies; 11+ messages in thread
From: Uros Bizjak @ 2023-10-12 20:12 UTC (permalink / raw)
To: x86, xen-devel, linux-kernel
Cc: Uros Bizjak, Juergen Gross, Boris Ostrovsky, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin
PER_CPU_VAR macro should be applied to a symbol and its addend.
Inconsistent usage is currently harmless, but needs to be corrected
before %rip-relative addressing is introduced to PER_CPU_VAR macro.
No functional changes intended.
Cc: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
---
arch/x86/xen/xen-asm.S | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/arch/x86/xen/xen-asm.S b/arch/x86/xen/xen-asm.S
index 9e5e68008785..448958ddbaf8 100644
--- a/arch/x86/xen/xen-asm.S
+++ b/arch/x86/xen/xen-asm.S
@@ -28,7 +28,7 @@
* non-zero.
*/
SYM_FUNC_START(xen_irq_disable_direct)
- movb $1, PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_mask
+ movb $1, PER_CPU_VAR(xen_vcpu_info + XEN_vcpu_info_mask)
RET
SYM_FUNC_END(xen_irq_disable_direct)
@@ -69,7 +69,7 @@ SYM_FUNC_END(check_events)
SYM_FUNC_START(xen_irq_enable_direct)
FRAME_BEGIN
/* Unmask events */
- movb $0, PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_mask
+ movb $0, PER_CPU_VAR(xen_vcpu_info + XEN_vcpu_info_mask)
/*
* Preempt here doesn't matter because that will deal with any
@@ -78,7 +78,7 @@ SYM_FUNC_START(xen_irq_enable_direct)
*/
/* Test for pending */
- testb $0xff, PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_pending
+ testb $0xff, PER_CPU_VAR(xen_vcpu_info + XEN_vcpu_info_pending)
jz 1f
call check_events
@@ -97,7 +97,7 @@ SYM_FUNC_END(xen_irq_enable_direct)
* x86 use opposite senses (mask vs enable).
*/
SYM_FUNC_START(xen_save_fl_direct)
- testb $0xff, PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_mask
+ testb $0xff, PER_CPU_VAR(xen_vcpu_info + XEN_vcpu_info_mask)
setz %ah
addb %ah, %ah
RET
@@ -113,7 +113,7 @@ SYM_FUNC_END(xen_read_cr2);
SYM_FUNC_START(xen_read_cr2_direct)
FRAME_BEGIN
- _ASM_MOV PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_arch_cr2, %_ASM_AX
+ _ASM_MOV PER_CPU_VAR(xen_vcpu_info + XEN_vcpu_info_arch_cr2), %_ASM_AX
FRAME_END
RET
SYM_FUNC_END(xen_read_cr2_direct);
--
2.41.0
* [PATCH v2 4/4] x86/percpu: Introduce %rip-relative addressing to PER_CPU_VAR macro
2023-10-12 20:12 [PATCH v2 0/4] Introduce %rip-relative addressing to PER_CPU_VAR macro Uros Bizjak
` (2 preceding siblings ...)
2023-10-12 20:12 ` [PATCH v2 3/4] x86/percpu, xen: " Uros Bizjak
@ 2023-10-12 20:12 ` Uros Bizjak
2023-10-12 20:53 ` [PATCH v2 0/4] " Dave Hansen
4 siblings, 0 replies; 11+ messages in thread
From: Uros Bizjak @ 2023-10-12 20:12 UTC (permalink / raw)
To: x86, xen-devel, linux-kernel
Cc: Uros Bizjak, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, H. Peter Anvin, Peter Zijlstra
Introduce x86_64 %rip-relative addressing to the PER_CPU_VAR macro.
An instruction with a %rip-relative address operand is one byte shorter
than its absolute-address counterpart and is also compatible with
position-independent executable (-fpie) builds.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
---
arch/x86/include/asm/percpu.h | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
index 83e6a4bcea38..c53c5a7f8e78 100644
--- a/arch/x86/include/asm/percpu.h
+++ b/arch/x86/include/asm/percpu.h
@@ -4,19 +4,21 @@
#ifdef CONFIG_X86_64
#define __percpu_seg gs
+#define __percpu_rel (%rip)
#else
#define __percpu_seg fs
+#define __percpu_rel
#endif
#ifdef __ASSEMBLY__
#ifdef CONFIG_SMP
#define PER_CPU_ARG(arg) %__percpu_seg:arg
-#define PER_CPU_VAR(var) %__percpu_seg:var
+#define PER_CPU_VAR(var) %__percpu_seg:(var)##__percpu_rel
#else /* ! SMP */
#define PER_CPU_ARG(arg) arg
-#define PER_CPU_VAR(var) var
-#endif /* SMP */
+#define PER_CPU_VAR(var) (var)##__percpu_rel
+#endif /* SMP */
#ifdef CONFIG_X86_64_SMP
#define INIT_PER_CPU_VAR(var) init_per_cpu__##var
--
2.41.0
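The one-byte saving mentioned in the commit message comes from the x86-64 operand encoding: an absolute 32-bit address needs a SIB byte after ModRM, while a %rip-relative operand does not. The sketch below hand-assembles one representative instruction (a 64-bit load into %rax with a %gs override) to show the difference. The array names are hypothetical and the displacement bytes are zero placeholders.

```c
#include <stddef.h>
#include <stdint.h>

/* movq %gs:disp32, %rax (absolute disp32: ModRM plus SIB) */
const uint8_t mov_abs[] = {
	0x65,			/* %gs segment override */
	0x48,			/* REX.W */
	0x8b,			/* MOV r64, r/m64 */
	0x04,			/* ModRM: mod=00 reg=rax rm=100 (SIB follows) */
	0x25,			/* SIB: no index, no base, disp32 only */
	0x00, 0x00, 0x00, 0x00,	/* disp32 placeholder */
};

/* movq %gs:disp32(%rip), %rax (RIP-relative: ModRM only) */
const uint8_t mov_rip[] = {
	0x65,			/* %gs segment override */
	0x48,			/* REX.W */
	0x8b,			/* MOV r64, r/m64 */
	0x05,			/* ModRM: mod=00 reg=rax rm=101 (RIP + disp32) */
	0x00, 0x00, 0x00, 0x00,	/* disp32 placeholder */
};

/* Size difference between the two encodings, in bytes. */
size_t bytes_saved(void)
{
	return sizeof(mov_abs) - sizeof(mov_rip);
}
```

The two encodings are 9 and 8 bytes long respectively, so each converted operand of this kind saves a single byte.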
* Re: [PATCH v2 0/4] Introduce %rip-relative addressing to PER_CPU_VAR macro
2023-10-12 20:12 [PATCH v2 0/4] Introduce %rip-relative addressing to PER_CPU_VAR macro Uros Bizjak
` (3 preceding siblings ...)
2023-10-12 20:12 ` [PATCH v2 4/4] x86/percpu: Introduce %rip-relative addressing to PER_CPU_VAR macro Uros Bizjak
@ 2023-10-12 20:53 ` Dave Hansen
2023-10-12 20:59 ` Uros Bizjak
4 siblings, 1 reply; 11+ messages in thread
From: Dave Hansen @ 2023-10-12 20:53 UTC (permalink / raw)
To: Uros Bizjak, x86, xen-devel, linux-kernel
Cc: Juergen Gross, Boris Ostrovsky, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, H. Peter Anvin
On 10/12/23 13:12, Uros Bizjak wrote:
> The last patch introduces (%rip) suffix and uses it for x86_64 target,
> resulting in a small code size decrease:
>
>     text    data    bss      dec     hex filename
> 25510677 4386685 808388 30705750 1d48856 vmlinux-new.o
> 25510629 4386685 808388 30705702 1d48826 vmlinux-old.o
I feel like I'm missing some of the motivation here.
50 bytes is great and all, but it isn't without the cost of changing
some rules and introducing potential PER_CPU_ARG() vs. PER_CPU_VAR()
confusion.
Are there some other side benefits? What else does this enable?
* Re: [PATCH v2 0/4] Introduce %rip-relative addressing to PER_CPU_VAR macro
2023-10-12 20:53 ` [PATCH v2 0/4] " Dave Hansen
@ 2023-10-12 20:59 ` Uros Bizjak
2023-10-12 21:08 ` H. Peter Anvin
0 siblings, 1 reply; 11+ messages in thread
From: Uros Bizjak @ 2023-10-12 20:59 UTC (permalink / raw)
To: Dave Hansen
Cc: x86, xen-devel, linux-kernel, Juergen Gross, Boris Ostrovsky,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
H. Peter Anvin
On Thu, Oct 12, 2023 at 10:53 PM Dave Hansen <dave.hansen@intel.com> wrote:
>
> On 10/12/23 13:12, Uros Bizjak wrote:
> > The last patch introduces (%rip) suffix and uses it for x86_64 target,
> > resulting in a small code size decrease:
> >
> >     text    data    bss      dec     hex filename
> > 25510677 4386685 808388 30705750 1d48856 vmlinux-new.o
> > 25510629 4386685 808388 30705702 1d48826 vmlinux-old.o
>
> I feel like I'm missing some of the motivation here.
>
> 50 bytes is great and all, but it isn't without the cost of changing
> some rules and introducing potential PER_CPU_ARG() vs. PER_CPU_VAR()
> confusion.
>
> Are there some other side benefits? What else does this enable?
These changes are necessary to build the kernel as Position
Independent Executable (PIE) on x86_64 [1]. And since I was working in
percpu area I thought that it was worth implementing them.
[1] https://lore.kernel.org/lkml/cover.1682673542.git.houwenlong.hwl@antgroup.com/
Uros.
* Re: [PATCH v2 0/4] Introduce %rip-relative addressing to PER_CPU_VAR macro
2023-10-12 20:59 ` Uros Bizjak
@ 2023-10-12 21:08 ` H. Peter Anvin
2023-10-12 21:17 ` Uros Bizjak
0 siblings, 1 reply; 11+ messages in thread
From: H. Peter Anvin @ 2023-10-12 21:08 UTC (permalink / raw)
To: Uros Bizjak, Dave Hansen
Cc: x86, xen-devel, linux-kernel, Juergen Gross, Boris Ostrovsky,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen
On 10/12/23 13:59, Uros Bizjak wrote:
> On Thu, Oct 12, 2023 at 10:53 PM Dave Hansen <dave.hansen@intel.com> wrote:
>>
>> On 10/12/23 13:12, Uros Bizjak wrote:
>>> The last patch introduces (%rip) suffix and uses it for x86_64 target,
>>> resulting in a small code size decrease:
>>>
>>>     text    data    bss      dec     hex filename
>>> 25510677 4386685 808388 30705750 1d48856 vmlinux-new.o
>>> 25510629 4386685 808388 30705702 1d48826 vmlinux-old.o
>>
>> I feel like I'm missing some of the motivation here.
>>
>> 50 bytes is great and all, but it isn't without the cost of changing
>> some rules and introducing potential PER_CPU_ARG() vs. PER_CPU_VAR()
>> confusion.
>>
>> Are there some other side benefits? What else does this enable?
>
> These changes are necessary to build the kernel as Position
> Independent Executable (PIE) on x86_64 [1]. And since I was working in
> percpu area I thought that it was worth implementing them.
>
> [1] https://lore.kernel.org/lkml/cover.1682673542.git.houwenlong.hwl@antgroup.com/
>
Are you PIC-adjusting the percpu variables as well?
-hpa
* Re: [PATCH v2 0/4] Introduce %rip-relative addressing to PER_CPU_VAR macro
2023-10-12 21:08 ` H. Peter Anvin
@ 2023-10-12 21:17 ` Uros Bizjak
2023-10-12 21:21 ` H. Peter Anvin
0 siblings, 1 reply; 11+ messages in thread
From: Uros Bizjak @ 2023-10-12 21:17 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Dave Hansen, x86, xen-devel, linux-kernel, Juergen Gross,
Boris Ostrovsky, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen
On Thu, Oct 12, 2023 at 11:08 PM H. Peter Anvin <hpa@zytor.com> wrote:
>
> On 10/12/23 13:59, Uros Bizjak wrote:
> > On Thu, Oct 12, 2023 at 10:53 PM Dave Hansen <dave.hansen@intel.com> wrote:
> >>
> >> On 10/12/23 13:12, Uros Bizjak wrote:
> >>> The last patch introduces (%rip) suffix and uses it for x86_64 target,
> >>> resulting in a small code size decrease:
> >>>
> >>>     text    data    bss      dec     hex filename
> >>> 25510677 4386685 808388 30705750 1d48856 vmlinux-new.o
> >>> 25510629 4386685 808388 30705702 1d48826 vmlinux-old.o
> >>
> >> I feel like I'm missing some of the motivation here.
> >>
> >> 50 bytes is great and all, but it isn't without the cost of changing
> >> some rules and introducing potential PER_CPU_ARG() vs. PER_CPU_VAR()
> >> confusion.
> >>
> >> Are there some other side benefits? What else does this enable?
> >
> > These changes are necessary to build the kernel as Position
> > Independent Executable (PIE) on x86_64 [1]. And since I was working in
> > percpu area I thought that it was worth implementing them.
> >
> > [1] https://lore.kernel.org/lkml/cover.1682673542.git.houwenlong.hwl@antgroup.com/
> >
>
> Are you PIC-adjusting the percpu variables as well?
After this patch (and after fixing percpu_stable_op to use the "a"
operand modifier on GCC), only *one* absolute reference to a percpu
variable remains, in xen-head.S, where:
movq $INIT_PER_CPU_VAR(fixed_percpu_data),%rax
should be changed to use leaq.
All others should then be (%rip)-relative.
Uros.
* Re: [PATCH v2 0/4] Introduce %rip-relative addressing to PER_CPU_VAR macro
2023-10-12 21:17 ` Uros Bizjak
@ 2023-10-12 21:21 ` H. Peter Anvin
2023-10-12 22:44 ` Uros Bizjak
0 siblings, 1 reply; 11+ messages in thread
From: H. Peter Anvin @ 2023-10-12 21:21 UTC (permalink / raw)
To: Uros Bizjak
Cc: Dave Hansen, x86, xen-devel, linux-kernel, Juergen Gross,
Boris Ostrovsky, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen
On 10/12/23 14:17, Uros Bizjak wrote:
>>
>> Are you PIC-adjusting the percpu variables as well?
>
> After this patch (and after fixing percpu_stable_op to use "a" operand
> modifier on GCC), the only *one* remaining absolute reference to
> percpu variable remain in xen-head.S, where:
>
> movq $INIT_PER_CPU_VAR(fixed_percpu_data),%rax
>
> should be changed to use leaq.
>
> All others should then be (%rip)-relative.
>
I mean, the symbols themselves are relative, not absolute?
-hpa
* Re: [PATCH v2 0/4] Introduce %rip-relative addressing to PER_CPU_VAR macro
2023-10-12 21:21 ` H. Peter Anvin
@ 2023-10-12 22:44 ` Uros Bizjak
0 siblings, 0 replies; 11+ messages in thread
From: Uros Bizjak @ 2023-10-12 22:44 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Dave Hansen, x86, xen-devel, linux-kernel, Juergen Gross,
Boris Ostrovsky, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen
On Thu, Oct 12, 2023 at 11:22 PM H. Peter Anvin <hpa@zytor.com> wrote:
>
> On 10/12/23 14:17, Uros Bizjak wrote:
> >>
> >> Are you PIC-adjusting the percpu variables as well?
> >
> > After this patch (and after fixing percpu_stable_op to use "a" operand
> > modifier on GCC), the only *one* remaining absolute reference to
> > percpu variable remain in xen-head.S, where:
> >
> > movq $INIT_PER_CPU_VAR(fixed_percpu_data),%rax
> >
> > should be changed to use leaq.
> >
> > All others should then be (%rip)-relative.
> >
>
> I mean, the symbols themselves are relative, not absolute?
The reference to the symbol is relative to the segment register, but
absolute with respect to the location of the instruction. If the
executable changes location, the instruction moves with it and the
reference is no longer valid. A (%rip)-relative reference compensates
for the changed location of the instruction.
Uros.
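The point about relocation can be illustrated with a toy model of how the two operand forms resolve. This is a simplified sketch under stated assumptions: it ignores the segment base (which is added on top in both forms), and the function names and the addresses used with them are hypothetical.

```c
#include <stdint.h>

/* Absolute operand: the link-time address is baked into the instruction. */
uint64_t resolve_absolute(uint64_t encoded_addr)
{
	return encoded_addr;
}

/*
 * RIP-relative operand: only the signed distance from the address of
 * the next instruction is encoded.
 */
uint64_t resolve_rip_relative(uint64_t next_insn_addr, int32_t disp)
{
	return next_insn_addr + (uint64_t)(int64_t)disp;
}
```

Suppose the image is linked with the next instruction at 0x1000 and the symbol at 0x5000, so the encoded displacement is 0x4000. If the image is then loaded 0x200000 higher, resolve_rip_relative(0x201000, 0x4000) gives 0x205000, the symbol's new location, whereas resolve_absolute(0x5000) still gives the stale link-time address.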