public inbox for linux-kernel@vger.kernel.org
* [PATCH v2 0/4]  Introduce %rip-relative addressing to PER_CPU_VAR macro
@ 2023-10-12 20:12 Uros Bizjak
  2023-10-12 20:12 ` [PATCH v2 1/4] x86/percpu: Introduce PER_CPU_ARG and use it in cmpxchg{8,16}b_emu.S Uros Bizjak
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Uros Bizjak @ 2023-10-12 20:12 UTC (permalink / raw)
  To: x86, xen-devel, linux-kernel
  Cc: Uros Bizjak, Juergen Gross, Boris Ostrovsky, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin

This patch series introduces %rip-relative addressing to the
PER_CPU_VAR macro. An instruction with a %rip-relative address operand
is one byte shorter than its absolute-address counterpart and is also
compatible with a position-independent executable (-fpie) build.

The first three patches are cleanups that fix various inconsistencies
throughout the assembly code.

The last patch introduces the (%rip) suffix and uses it for the x86_64
target, resulting in a small code size decrease:

    text    data     bss      dec     hex filename
25510677 4386685  808388 30705750 1d48856 vmlinux-new.o
25510629 4386685  808388 30705702 1d48826 vmlinux-old.o
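
As a cross-check of the figures above, the following minimal sketch
recomputes the "dec" column and the size of the text difference. The
function names are hypothetical and exist only for this illustration.
Note that, as printed, the larger text value sits in the vmlinux-new.o
row, so the sketch reports only the magnitude of the difference.

```c
/* Sum of the text, data and bss columns, as reported in the "dec" column. */
long size_total(long text, long data, long bss)
{
	return text + data + bss;
}

/* Magnitude of the difference between two text sizes, in bytes. */
long text_delta(long text_a, long text_b)
{
	long d = text_a - text_b;

	return d < 0 ? -d : d;
}
```

Passing the two text values from the table to text_delta() gives the
number of bytes by which the two builds differ.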

The patch series is against current mainline and can be applied
independently of ongoing percpu work.

v2: Introduce PER_CPU_ARG macro to conditionally enable
    segment registers in cmpxchg{8,16}b_emu.S for CONFIG_SMP.

Cc: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>

Uros Bizjak (4):
  x86/percpu: Introduce PER_CPU_ARG and use it in cmpxchg{8,16}b_emu.S
  x86/percpu: Correct PER_CPU_VAR usage to include symbol and its addend
  x86/percpu, xen: Correct PER_CPU_VAR usage to include symbol and its
    addend
  x86/percpu: Introduce %rip-relative addressing to PER_CPU_VAR macro

 arch/x86/entry/calling.h      |  2 +-
 arch/x86/entry/entry_32.S     |  2 +-
 arch/x86/entry/entry_64.S     |  2 +-
 arch/x86/include/asm/percpu.h | 10 +++++++---
 arch/x86/kernel/head_64.S     |  2 +-
 arch/x86/lib/cmpxchg16b_emu.S | 12 ++++++------
 arch/x86/lib/cmpxchg8b_emu.S  | 24 ++++++++++++++++++------
 arch/x86/xen/xen-asm.S        | 10 +++++-----
 8 files changed, 40 insertions(+), 24 deletions(-)

-- 
2.41.0



* [PATCH v2 1/4] x86/percpu: Introduce PER_CPU_ARG and use it in cmpxchg{8,16}b_emu.S
  2023-10-12 20:12 [PATCH v2 0/4] Introduce %rip-relative addressing to PER_CPU_VAR macro Uros Bizjak
@ 2023-10-12 20:12 ` Uros Bizjak
  2023-10-12 20:12 ` [PATCH v2 2/4] x86/percpu: Correct PER_CPU_VAR usage to include symbol and its addend Uros Bizjak
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: Uros Bizjak @ 2023-10-12 20:12 UTC (permalink / raw)
  To: x86, xen-devel, linux-kernel
  Cc: Uros Bizjak, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, Peter Zijlstra

The PER_CPU_VAR macro is intended to be applied to a symbol and should
not be used with general operands. Introduce a new PER_CPU_ARG macro and
use it in cmpxchg{8,16}b_emu.S instead.

The PER_CPU_VAR macro will be repurposed for %rip-relative addressing.
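
To illustrate the distinction, the sketch below uses the C preprocessor
to show what each macro would expand to for a register-based operand
such as 0(%rsi). The DEMO_* names are hypothetical stand-ins, not the
kernel macros: they hard-code the x86_64 SMP case (%gs segment, (%rip)
suffix) and leave out the ## pasting used in the real header.

```c
/* Two-level stringification so the argument is macro-expanded first. */
#define DEMO_STR_(x)		#x
#define DEMO_STR(x)		DEMO_STR_(x)

/* Hypothetical stand-ins (assumption: x86_64, CONFIG_SMP). */
#define DEMO_PER_CPU_ARG(arg)	%gs:arg
#define DEMO_PER_CPU_VAR(var)	%gs:(var)(%rip)

/* Expansion of the general-operand macro applied to 0(%rsi). */
const char *demo_arg_expansion(void)
{
	return DEMO_STR(DEMO_PER_CPU_ARG(0(%rsi)));
}

/* Expansion of the symbol macro misapplied to the same operand. */
const char *demo_var_expansion(void)
{
	return DEMO_STR(DEMO_PER_CPU_VAR(0(%rsi)));
}
```

The second expansion carries both a register base and a (%rip) suffix,
which suggests why register operands need their own macro once the
suffix is added to PER_CPU_VAR.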

Also add a missing function comment to this_cpu_cmpxchg8b_emu.

No functional changes intended.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
--
v2: Introduce PER_CPU_ARG macro to conditionally enable
    segment registers in cmpxchg{8,16}b_emu.S for CONFIG_SMP.
---
 arch/x86/include/asm/percpu.h |  2 ++
 arch/x86/lib/cmpxchg16b_emu.S | 12 ++++++------
 arch/x86/lib/cmpxchg8b_emu.S  | 24 ++++++++++++++++++------
 3 files changed, 26 insertions(+), 12 deletions(-)

diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
index 34734d730463..83e6a4bcea38 100644
--- a/arch/x86/include/asm/percpu.h
+++ b/arch/x86/include/asm/percpu.h
@@ -11,8 +11,10 @@
 #ifdef __ASSEMBLY__
 
 #ifdef CONFIG_SMP
+#define PER_CPU_ARG(arg)	%__percpu_seg:arg
 #define PER_CPU_VAR(var)	%__percpu_seg:var
 #else /* ! SMP */
+#define PER_CPU_ARG(arg)	arg
 #define PER_CPU_VAR(var)	var
 #endif	/* SMP */
 
diff --git a/arch/x86/lib/cmpxchg16b_emu.S b/arch/x86/lib/cmpxchg16b_emu.S
index 6962df315793..b6b942d07a00 100644
--- a/arch/x86/lib/cmpxchg16b_emu.S
+++ b/arch/x86/lib/cmpxchg16b_emu.S
@@ -23,14 +23,14 @@ SYM_FUNC_START(this_cpu_cmpxchg16b_emu)
 	cli
 
 	/* if (*ptr == old) */
-	cmpq	PER_CPU_VAR(0(%rsi)), %rax
+	cmpq	PER_CPU_ARG(0(%rsi)), %rax
 	jne	.Lnot_same
-	cmpq	PER_CPU_VAR(8(%rsi)), %rdx
+	cmpq	PER_CPU_ARG(8(%rsi)), %rdx
 	jne	.Lnot_same
 
 	/* *ptr = new */
-	movq	%rbx, PER_CPU_VAR(0(%rsi))
-	movq	%rcx, PER_CPU_VAR(8(%rsi))
+	movq	%rbx, PER_CPU_ARG(0(%rsi))
+	movq	%rcx, PER_CPU_ARG(8(%rsi))
 
 	/* set ZF in EFLAGS to indicate success */
 	orl	$X86_EFLAGS_ZF, (%rsp)
@@ -42,8 +42,8 @@ SYM_FUNC_START(this_cpu_cmpxchg16b_emu)
 	/* *ptr != old */
 
 	/* old = *ptr */
-	movq	PER_CPU_VAR(0(%rsi)), %rax
-	movq	PER_CPU_VAR(8(%rsi)), %rdx
+	movq	PER_CPU_ARG(0(%rsi)), %rax
+	movq	PER_CPU_ARG(8(%rsi)), %rdx
 
 	/* clear ZF in EFLAGS to indicate failure */
 	andl	$(~X86_EFLAGS_ZF), (%rsp)
diff --git a/arch/x86/lib/cmpxchg8b_emu.S b/arch/x86/lib/cmpxchg8b_emu.S
index 49805257b125..9a0a7feeaf7c 100644
--- a/arch/x86/lib/cmpxchg8b_emu.S
+++ b/arch/x86/lib/cmpxchg8b_emu.S
@@ -53,18 +53,30 @@ EXPORT_SYMBOL(cmpxchg8b_emu)
 
 #ifndef CONFIG_UML
 
+/*
+ * Emulate 'cmpxchg8b %fs:(%rsi)'
+ *
+ * Inputs:
+ * %esi : memory location to compare
+ * %eax : low 32 bits of old value
+ * %edx : high 32 bits of old value
+ * %ebx : low 32 bits of new value
+ * %ecx : high 32 bits of new value
+ *
+ * Notably this is not LOCK prefixed and is not safe against NMIs
+ */
 SYM_FUNC_START(this_cpu_cmpxchg8b_emu)
 
 	pushfl
 	cli
 
-	cmpl	PER_CPU_VAR(0(%esi)), %eax
+	cmpl	PER_CPU_ARG(0(%esi)), %eax
 	jne	.Lnot_same2
-	cmpl	PER_CPU_VAR(4(%esi)), %edx
+	cmpl	PER_CPU_ARG(4(%esi)), %edx
 	jne	.Lnot_same2
 
-	movl	%ebx, PER_CPU_VAR(0(%esi))
-	movl	%ecx, PER_CPU_VAR(4(%esi))
+	movl	%ebx, PER_CPU_ARG(0(%esi))
+	movl	%ecx, PER_CPU_ARG(4(%esi))
 
 	orl	$X86_EFLAGS_ZF, (%esp)
 
@@ -72,8 +84,8 @@ SYM_FUNC_START(this_cpu_cmpxchg8b_emu)
 	RET
 
 .Lnot_same2:
-	movl	PER_CPU_VAR(0(%esi)), %eax
-	movl	PER_CPU_VAR(4(%esi)), %edx
+	movl	PER_CPU_ARG(0(%esi)), %eax
+	movl	PER_CPU_ARG(4(%esi)), %edx
 
 	andl	$(~X86_EFLAGS_ZF), (%esp)
 
-- 
2.41.0



* [PATCH v2 2/4] x86/percpu: Correct PER_CPU_VAR usage to include symbol and its addend
  2023-10-12 20:12 [PATCH v2 0/4] Introduce %rip-relative addressing to PER_CPU_VAR macro Uros Bizjak
  2023-10-12 20:12 ` [PATCH v2 1/4] x86/percpu: Introduce PER_CPU_ARG and use it in cmpxchg{8,16}b_emu.S Uros Bizjak
@ 2023-10-12 20:12 ` Uros Bizjak
  2023-10-12 20:12 ` [PATCH v2 3/4] x86/percpu, xen: " Uros Bizjak
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: Uros Bizjak @ 2023-10-12 20:12 UTC (permalink / raw)
  To: x86, xen-devel, linux-kernel
  Cc: Uros Bizjak, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, Peter Zijlstra

The PER_CPU_VAR macro should be applied to a symbol and its addend.
Inconsistent usage is currently harmless, but needs to be corrected
before %rip-relative addressing is introduced to the PER_CPU_VAR macro.

No functional changes intended.
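
The effect of addend placement can be sketched with the C preprocessor.
The ADDEND_DEMO_* macros below are hypothetical stand-ins that hard-code
the x86_64 SMP expansion with a (%rip) suffix; they are not the kernel
definitions, and TLB_STATE_user_pcid_flush_mask is deliberately left
undefined so that it survives as a plain identifier.

```c
/* Two-level stringification so the argument is macro-expanded first. */
#define ADDEND_DEMO_STR_(x)	#x
#define ADDEND_DEMO_STR(x)	ADDEND_DEMO_STR_(x)

/* Hypothetical stand-in for PER_CPU_VAR with the (%rip) suffix. */
#define ADDEND_DEMO_VAR(var)	%gs:(var)(%rip)

/* Old style: the addend is written outside the macro invocation. */
const char *addend_outside(void)
{
	return ADDEND_DEMO_STR(ADDEND_DEMO_VAR(cpu_tlbstate) +
			       TLB_STATE_user_pcid_flush_mask);
}

/* New style: the symbol and its addend are passed together. */
const char *addend_inside(void)
{
	return ADDEND_DEMO_STR(ADDEND_DEMO_VAR(cpu_tlbstate +
					       TLB_STATE_user_pcid_flush_mask));
}
```

With the addend outside, it ends up after the (%rip) suffix rather than
inside the displacement expression; with the addend inside, the suffix
follows the complete expression.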

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
---
 arch/x86/entry/calling.h  | 2 +-
 arch/x86/entry/entry_32.S | 2 +-
 arch/x86/entry/entry_64.S | 2 +-
 arch/x86/kernel/head_64.S | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
index f6907627172b..47368ab0bda0 100644
--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -173,7 +173,7 @@ For 32-bit we have the following conventions - kernel is built with
 .endm
 
 #define THIS_CPU_user_pcid_flush_mask   \
-	PER_CPU_VAR(cpu_tlbstate) + TLB_STATE_user_pcid_flush_mask
+	PER_CPU_VAR(cpu_tlbstate + TLB_STATE_user_pcid_flush_mask)
 
 .macro SWITCH_TO_USER_CR3_NOSTACK scratch_reg:req scratch_reg2:req
 	ALTERNATIVE "jmp .Lend_\@", "", X86_FEATURE_PTI
diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index 6e6af42e044a..d4e094b2c877 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -305,7 +305,7 @@
 .macro CHECK_AND_APPLY_ESPFIX
 #ifdef CONFIG_X86_ESPFIX32
 #define GDT_ESPFIX_OFFSET (GDT_ENTRY_ESPFIX_SS * 8)
-#define GDT_ESPFIX_SS PER_CPU_VAR(gdt_page) + GDT_ESPFIX_OFFSET
+#define GDT_ESPFIX_SS PER_CPU_VAR(gdt_page + GDT_ESPFIX_OFFSET)
 
 	ALTERNATIVE	"jmp .Lend_\@", "", X86_BUG_ESPFIX
 
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 43606de22511..3d6770b87b87 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -252,7 +252,7 @@ SYM_FUNC_START(__switch_to_asm)
 
 #ifdef CONFIG_STACKPROTECTOR
 	movq	TASK_stack_canary(%rsi), %rbx
-	movq	%rbx, PER_CPU_VAR(fixed_percpu_data) + FIXED_stack_canary
+	movq	%rbx, PER_CPU_VAR(fixed_percpu_data + FIXED_stack_canary)
 #endif
 
 	/*
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index ea6995920b7a..bfe5ec2f4f83 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -449,7 +449,7 @@ SYM_CODE_START(soft_restart_cpu)
 	UNWIND_HINT_END_OF_STACK
 
 	/* Find the idle task stack */
-	movq	PER_CPU_VAR(pcpu_hot) + X86_current_task, %rcx
+	movq	PER_CPU_VAR(pcpu_hot + X86_current_task), %rcx
 	movq	TASK_threadsp(%rcx), %rsp
 
 	jmp	.Ljump_to_C_code
-- 
2.41.0



* [PATCH v2 3/4] x86/percpu, xen: Correct PER_CPU_VAR usage to include symbol and its addend
  2023-10-12 20:12 [PATCH v2 0/4] Introduce %rip-relative addressing to PER_CPU_VAR macro Uros Bizjak
  2023-10-12 20:12 ` [PATCH v2 1/4] x86/percpu: Introduce PER_CPU_ARG and use it in cmpxchg{8,16}b_emu.S Uros Bizjak
  2023-10-12 20:12 ` [PATCH v2 2/4] x86/percpu: Correct PER_CPU_VAR usage to include symbol and its addend Uros Bizjak
@ 2023-10-12 20:12 ` Uros Bizjak
  2023-10-12 20:12 ` [PATCH v2 4/4] x86/percpu: Introduce %rip-relative addressing to PER_CPU_VAR macro Uros Bizjak
  2023-10-12 20:53 ` [PATCH v2 0/4] " Dave Hansen
  4 siblings, 0 replies; 11+ messages in thread
From: Uros Bizjak @ 2023-10-12 20:12 UTC (permalink / raw)
  To: x86, xen-devel, linux-kernel
  Cc: Uros Bizjak, Juergen Gross, Boris Ostrovsky, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin

The PER_CPU_VAR macro should be applied to a symbol and its addend.
Inconsistent usage is currently harmless, but needs to be corrected
before %rip-relative addressing is introduced to the PER_CPU_VAR macro.

No functional changes intended.

Cc: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
---
 arch/x86/xen/xen-asm.S | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/xen/xen-asm.S b/arch/x86/xen/xen-asm.S
index 9e5e68008785..448958ddbaf8 100644
--- a/arch/x86/xen/xen-asm.S
+++ b/arch/x86/xen/xen-asm.S
@@ -28,7 +28,7 @@
  * non-zero.
  */
 SYM_FUNC_START(xen_irq_disable_direct)
-	movb $1, PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_mask
+	movb $1, PER_CPU_VAR(xen_vcpu_info + XEN_vcpu_info_mask)
 	RET
 SYM_FUNC_END(xen_irq_disable_direct)
 
@@ -69,7 +69,7 @@ SYM_FUNC_END(check_events)
 SYM_FUNC_START(xen_irq_enable_direct)
 	FRAME_BEGIN
 	/* Unmask events */
-	movb $0, PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_mask
+	movb $0, PER_CPU_VAR(xen_vcpu_info + XEN_vcpu_info_mask)
 
 	/*
 	 * Preempt here doesn't matter because that will deal with any
@@ -78,7 +78,7 @@ SYM_FUNC_START(xen_irq_enable_direct)
 	 */
 
 	/* Test for pending */
-	testb $0xff, PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_pending
+	testb $0xff, PER_CPU_VAR(xen_vcpu_info + XEN_vcpu_info_pending)
 	jz 1f
 
 	call check_events
@@ -97,7 +97,7 @@ SYM_FUNC_END(xen_irq_enable_direct)
  * x86 use opposite senses (mask vs enable).
  */
 SYM_FUNC_START(xen_save_fl_direct)
-	testb $0xff, PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_mask
+	testb $0xff, PER_CPU_VAR(xen_vcpu_info + XEN_vcpu_info_mask)
 	setz %ah
 	addb %ah, %ah
 	RET
@@ -113,7 +113,7 @@ SYM_FUNC_END(xen_read_cr2);
 
 SYM_FUNC_START(xen_read_cr2_direct)
 	FRAME_BEGIN
-	_ASM_MOV PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_arch_cr2, %_ASM_AX
+	_ASM_MOV PER_CPU_VAR(xen_vcpu_info + XEN_vcpu_info_arch_cr2), %_ASM_AX
 	FRAME_END
 	RET
 SYM_FUNC_END(xen_read_cr2_direct);
-- 
2.41.0



* [PATCH v2 4/4] x86/percpu: Introduce %rip-relative addressing to PER_CPU_VAR macro
  2023-10-12 20:12 [PATCH v2 0/4] Introduce %rip-relative addressing to PER_CPU_VAR macro Uros Bizjak
                   ` (2 preceding siblings ...)
  2023-10-12 20:12 ` [PATCH v2 3/4] x86/percpu, xen: " Uros Bizjak
@ 2023-10-12 20:12 ` Uros Bizjak
  2023-10-12 20:53 ` [PATCH v2 0/4] " Dave Hansen
  4 siblings, 0 replies; 11+ messages in thread
From: Uros Bizjak @ 2023-10-12 20:12 UTC (permalink / raw)
  To: x86, xen-devel, linux-kernel
  Cc: Uros Bizjak, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, Peter Zijlstra

Introduce x86_64 %rip-relative addressing to the PER_CPU_VAR macro.
An instruction with a %rip-relative address operand is one byte shorter
than its absolute-address counterpart and is also compatible with a
position-independent executable (-fpie) build.
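
The one-byte difference comes from the addressing-mode encoding: in
64-bit mode a bare 32-bit absolute displacement needs a SIB byte, while
the %rip-relative form does not. The sketch below hand-encodes a
"movq %gs:<disp32>, %rax" load in both forms as an illustration. The
array and function names are hypothetical, and the displacement bytes
are zero placeholders.

```c
#include <stddef.h>

/* movq %gs:disp32, %rax using absolute addressing. */
static const unsigned char mov_gs_abs[] = {
	0x65,		/* GS segment override prefix */
	0x48,		/* REX.W: 64-bit operand size */
	0x8b,		/* MOV r64, r/m64 */
	0x04,		/* ModRM: mod=00 reg=rax rm=100, SIB follows */
	0x25,		/* SIB: no index, base=101, disp32 only */
	0, 0, 0, 0	/* disp32 placeholder */
};

/* movq %gs:disp32(%rip), %rax using %rip-relative addressing. */
static const unsigned char mov_gs_rip[] = {
	0x65,		/* GS segment override prefix */
	0x48,		/* REX.W: 64-bit operand size */
	0x8b,		/* MOV r64, r/m64 */
	0x05,		/* ModRM: mod=00 reg=rax rm=101, RIP + disp32 */
	0, 0, 0, 0	/* disp32 placeholder */
};

/* Encoded length of the absolute form, in bytes. */
size_t abs_insn_len(void)
{
	return sizeof(mov_gs_abs);
}

/* Encoded length of the %rip-relative form, in bytes. */
size_t rip_insn_len(void)
{
	return sizeof(mov_gs_rip);
}
```

Comparing abs_insn_len() with rip_insn_len() shows the saving for this
particular instruction; other instructions gain the same single byte
because only the SIB byte is dropped.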

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
---
 arch/x86/include/asm/percpu.h | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
index 83e6a4bcea38..c53c5a7f8e78 100644
--- a/arch/x86/include/asm/percpu.h
+++ b/arch/x86/include/asm/percpu.h
@@ -4,19 +4,21 @@
 
 #ifdef CONFIG_X86_64
 #define __percpu_seg		gs
+#define __percpu_rel		(%rip)
 #else
 #define __percpu_seg		fs
+#define __percpu_rel
 #endif
 
 #ifdef __ASSEMBLY__
 
 #ifdef CONFIG_SMP
 #define PER_CPU_ARG(arg)	%__percpu_seg:arg
-#define PER_CPU_VAR(var)	%__percpu_seg:var
+#define PER_CPU_VAR(var)	%__percpu_seg:(var)##__percpu_rel
 #else /* ! SMP */
 #define PER_CPU_ARG(arg)	arg
-#define PER_CPU_VAR(var)	var
-#endif	/* SMP */
+#define PER_CPU_VAR(var)	(var)##__percpu_rel
+#endif /* SMP */
 
 #ifdef CONFIG_X86_64_SMP
 #define INIT_PER_CPU_VAR(var)  init_per_cpu__##var
-- 
2.41.0



* Re: [PATCH v2 0/4] Introduce %rip-relative addressing to PER_CPU_VAR macro
  2023-10-12 20:12 [PATCH v2 0/4] Introduce %rip-relative addressing to PER_CPU_VAR macro Uros Bizjak
                   ` (3 preceding siblings ...)
  2023-10-12 20:12 ` [PATCH v2 4/4] x86/percpu: Introduce %rip-relative addressing to PER_CPU_VAR macro Uros Bizjak
@ 2023-10-12 20:53 ` Dave Hansen
  2023-10-12 20:59   ` Uros Bizjak
  4 siblings, 1 reply; 11+ messages in thread
From: Dave Hansen @ 2023-10-12 20:53 UTC (permalink / raw)
  To: Uros Bizjak, x86, xen-devel, linux-kernel
  Cc: Juergen Gross, Boris Ostrovsky, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin

On 10/12/23 13:12, Uros Bizjak wrote:
> The last patch introduces (%rip) suffix and uses it for x86_64 target,
> resulting in a small code size decrease:
>
>     text    data     bss      dec     hex filename
> 25510677 4386685  808388 30705750 1d48856 vmlinux-new.o
> 25510629 4386685  808388 30705702 1d48826 vmlinux-old.o

I feel like I'm missing some of the motivation here.

50 bytes is great and all, but it isn't without the cost of changing
some rules and introducing potential PER_CPU_ARG() vs. PER_CPU_VAR()
confusion.

Are there some other side benefits?  What else does this enable?


* Re: [PATCH v2 0/4] Introduce %rip-relative addressing to PER_CPU_VAR macro
  2023-10-12 20:53 ` [PATCH v2 0/4] " Dave Hansen
@ 2023-10-12 20:59   ` Uros Bizjak
  2023-10-12 21:08     ` H. Peter Anvin
  0 siblings, 1 reply; 11+ messages in thread
From: Uros Bizjak @ 2023-10-12 20:59 UTC (permalink / raw)
  To: Dave Hansen
  Cc: x86, xen-devel, linux-kernel, Juergen Gross, Boris Ostrovsky,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin

On Thu, Oct 12, 2023 at 10:53 PM Dave Hansen <dave.hansen@intel.com> wrote:
>
> On 10/12/23 13:12, Uros Bizjak wrote:
> > The last patch introduces (%rip) suffix and uses it for x86_64 target,
> > resulting in a small code size decrease:
> >
> >     text    data     bss      dec     hex filename
> > 25510677 4386685  808388 30705750 1d48856 vmlinux-new.o
> > 25510629 4386685  808388 30705702 1d48826 vmlinux-old.o
>
> I feel like I'm missing some of the motivation here.
>
> 50 bytes is great and all, but it isn't without the cost of changing
> some rules and introducing potential PER_CPU_ARG() vs. PER_CPU_VAR()
> confusion.
>
> Are there some other side benefits?  What else does this enable?

These changes are necessary to build the kernel as a Position
Independent Executable (PIE) on x86_64 [1]. Since I was working in the
percpu area, I thought it was worth implementing them.

[1] https://lore.kernel.org/lkml/cover.1682673542.git.houwenlong.hwl@antgroup.com/

Uros.


* Re: [PATCH v2 0/4] Introduce %rip-relative addressing to PER_CPU_VAR macro
  2023-10-12 20:59   ` Uros Bizjak
@ 2023-10-12 21:08     ` H. Peter Anvin
  2023-10-12 21:17       ` Uros Bizjak
  0 siblings, 1 reply; 11+ messages in thread
From: H. Peter Anvin @ 2023-10-12 21:08 UTC (permalink / raw)
  To: Uros Bizjak, Dave Hansen
  Cc: x86, xen-devel, linux-kernel, Juergen Gross, Boris Ostrovsky,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen

On 10/12/23 13:59, Uros Bizjak wrote:
> On Thu, Oct 12, 2023 at 10:53 PM Dave Hansen <dave.hansen@intel.com> wrote:
>>
>> On 10/12/23 13:12, Uros Bizjak wrote:
>>> The last patch introduces (%rip) suffix and uses it for x86_64 target,
>>> resulting in a small code size decrease:
>>>
>>>     text    data     bss      dec     hex filename
>>> 25510677 4386685  808388 30705750 1d48856 vmlinux-new.o
>>> 25510629 4386685  808388 30705702 1d48826 vmlinux-old.o
>>
>> I feel like I'm missing some of the motivation here.
>>
>> 50 bytes is great and all, but it isn't without the cost of changing
>> some rules and introducing potential PER_CPU_ARG() vs. PER_CPU_VAR()
>> confusion.
>>
>> Are there some other side benefits?  What else does this enable?
> 
> These changes are necessary to build the kernel as Position
> Independent Executable (PIE) on x86_64 [1]. And since I was working in
> percpu area I thought that it was worth implementing them.
> 
> [1] https://lore.kernel.org/lkml/cover.1682673542.git.houwenlong.hwl@antgroup.com/
> 

Are you PIC-adjusting the percpu variables as well?

	-hpa


* Re: [PATCH v2 0/4] Introduce %rip-relative addressing to PER_CPU_VAR macro
  2023-10-12 21:08     ` H. Peter Anvin
@ 2023-10-12 21:17       ` Uros Bizjak
  2023-10-12 21:21         ` H. Peter Anvin
  0 siblings, 1 reply; 11+ messages in thread
From: Uros Bizjak @ 2023-10-12 21:17 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Dave Hansen, x86, xen-devel, linux-kernel, Juergen Gross,
	Boris Ostrovsky, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen

On Thu, Oct 12, 2023 at 11:08 PM H. Peter Anvin <hpa@zytor.com> wrote:
>
> On 10/12/23 13:59, Uros Bizjak wrote:
> > On Thu, Oct 12, 2023 at 10:53 PM Dave Hansen <dave.hansen@intel.com> wrote:
> >>
> >> On 10/12/23 13:12, Uros Bizjak wrote:
> >>> The last patch introduces (%rip) suffix and uses it for x86_64 target,
> >>> resulting in a small code size decrease:
> >>>
> >>>     text    data     bss      dec     hex filename
> >>> 25510677 4386685  808388 30705750 1d48856 vmlinux-new.o
> >>> 25510629 4386685  808388 30705702 1d48826 vmlinux-old.o
> >>
> >> I feel like I'm missing some of the motivation here.
> >>
> >> 50 bytes is great and all, but it isn't without the cost of changing
> >> some rules and introducing potential PER_CPU_ARG() vs. PER_CPU_VAR()
> >> confusion.
> >>
> >> Are there some other side benefits?  What else does this enable?
> >
> > These changes are necessary to build the kernel as Position
> > Independent Executable (PIE) on x86_64 [1]. And since I was working in
> > percpu area I thought that it was worth implementing them.
> >
> > [1] https://lore.kernel.org/lkml/cover.1682673542.git.houwenlong.hwl@antgroup.com/
> >
>
> Are you PIC-adjusting the percpu variables as well?

After this patch (and after fixing percpu_stable_op to use the "a"
operand modifier on GCC), only *one* absolute reference to a percpu
variable remains, in xen-head.S, where:

    movq    $INIT_PER_CPU_VAR(fixed_percpu_data),%rax

should be changed to use leaq.

All others should then be (%rip)-relative.

Uros.


* Re: [PATCH v2 0/4] Introduce %rip-relative addressing to PER_CPU_VAR macro
  2023-10-12 21:17       ` Uros Bizjak
@ 2023-10-12 21:21         ` H. Peter Anvin
  2023-10-12 22:44           ` Uros Bizjak
  0 siblings, 1 reply; 11+ messages in thread
From: H. Peter Anvin @ 2023-10-12 21:21 UTC (permalink / raw)
  To: Uros Bizjak
  Cc: Dave Hansen, x86, xen-devel, linux-kernel, Juergen Gross,
	Boris Ostrovsky, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen

On 10/12/23 14:17, Uros Bizjak wrote:
>>
>> Are you PIC-adjusting the percpu variables as well?
> 
> After this patch (and after fixing percpu_stable_op to use "a" operand
> modifier on GCC), the only *one* remaining absolute reference to
> percpu variable remain in xen-head.S, where:
> 
>      movq    $INIT_PER_CPU_VAR(fixed_percpu_data),%rax
> 
> should be changed to use leaq.
> 
> All others should then be (%rip)-relative.
> 

I mean, the symbols themselves are relative, not absolute?

	-hpa



* Re: [PATCH v2 0/4] Introduce %rip-relative addressing to PER_CPU_VAR macro
  2023-10-12 21:21         ` H. Peter Anvin
@ 2023-10-12 22:44           ` Uros Bizjak
  0 siblings, 0 replies; 11+ messages in thread
From: Uros Bizjak @ 2023-10-12 22:44 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Dave Hansen, x86, xen-devel, linux-kernel, Juergen Gross,
	Boris Ostrovsky, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen

On Thu, Oct 12, 2023 at 11:22 PM H. Peter Anvin <hpa@zytor.com> wrote:
>
> On 10/12/23 14:17, Uros Bizjak wrote:
> >>
> >> Are you PIC-adjusting the percpu variables as well?
> >
> > After this patch (and after fixing percpu_stable_op to use "a" operand
> > modifier on GCC), the only *one* remaining absolute reference to
> > percpu variable remain in xen-head.S, where:
> >
> >      movq    $INIT_PER_CPU_VAR(fixed_percpu_data),%rax
> >
> > should be changed to use leaq.
> >
> > All others should then be (%rip)-relative.
> >
>
> I mean, the symbols themselves are relative, not absolute?

The reference to the symbol is relative to the segment register, but
absolute with respect to the location of the instruction. If the
executable changes location, the instruction moves and the reference is
no longer valid. A (%rip)-relative reference compensates for the changed
location of the instruction.
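
A generic sketch of that compensation, ignoring the segment base and
assuming that the instruction and its target are moved by the same
amount: the displacement stored in a %rip-relative operand is measured
from the end of the instruction, so it does not change when the whole
image is shifted. The function name and parameters are hypothetical.

```c
#include <stdint.h>

/*
 * Displacement stored in a %rip-relative operand: the distance from the
 * address of the next instruction to the target.
 */
int64_t rip_relative_disp(uint64_t insn_addr, uint64_t insn_len,
			  uint64_t target)
{
	return (int64_t)(target - (insn_addr + insn_len));
}
```

Evaluating the function before and after adding the same load offset to
both insn_addr and target returns the same displacement, whereas an
absolute operand would have to be rewritten to the new target address.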

Uros.

