From: Peter Zijlstra <peterz@infradead.org>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org,
x86@kernel.org, Linus Torvalds <torvalds@linux-foundation.org>,
Tim Chen <tim.c.chen@linux.intel.com>,
Josh Poimboeuf <jpoimboe@kernel.org>,
Andrew Cooper <Andrew.Cooper3@citrix.com>,
Pawan Gupta <pawan.kumar.gupta@linux.intel.com>,
Johannes Wikner <kwikner@ethz.ch>,
Alyssa Milburn <alyssa.milburn@linux.intel.com>,
Jann Horn <jannh@google.com>, "H.J. Lu" <hjl.tools@gmail.com>,
Joao Moreira <joao.moreira@intel.com>,
Joseph Nuzman <joseph.nuzman@intel.com>,
Steven Rostedt <rostedt@goodmis.org>,
Juergen Gross <jgross@suse.com>,
Masami Hiramatsu <mhiramat@kernel.org>,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
K Prateek Nayak <kprateek.nayak@amd.com>,
Eric Dumazet <edumazet@google.com>
Subject: [PATCH v2 08/59] x86/build: Ensure proper function alignment
Date: Fri, 02 Sep 2022 15:06:33 +0200
Message-ID: <20220902130947.190618587@infradead.org>
In-Reply-To: <20220902130625.217071627@infradead.org>
From: Thomas Gleixner <tglx@linutronix.de>
The Intel Architectures Optimization Reference Manual recommends
aligning functions to 16 bytes because the I-fetch width is 16 bytes
on many (Intel) uarchs. The AMD Software Optimization Guide (for
recent chips) mentions a 32 byte I-fetch window but a 16 byte decode
window.
Follow this advice and align functions to 16 bytes to optimize
instruction delivery to decode and reduce front-end bottlenecks.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
arch/x86/Kconfig.cpu | 6 ++++++
arch/x86/Makefile | 4 ++++
arch/x86/include/asm/linkage.h | 7 ++++---
include/asm-generic/vmlinux.lds.h | 7 ++++++-
4 files changed, 20 insertions(+), 4 deletions(-)
--- a/arch/x86/Kconfig.cpu
+++ b/arch/x86/Kconfig.cpu
@@ -517,3 +517,9 @@ config CPU_SUP_VORTEX_32
makes the kernel a tiny bit smaller.
If unsure, say N.
+
+# Defined here so it is defined for UM too
+config FUNCTION_ALIGNMENT
+ int
+ default 16 if X86_64 || X86_ALIGNMENT_16
+ default 8
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -84,6 +84,10 @@ else
KBUILD_CFLAGS += $(call cc-option,-fcf-protection=none)
endif
+ifneq ($(CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B),y)
+KBUILD_CFLAGS += -falign-functions=$(CONFIG_FUNCTION_ALIGNMENT)
+endif
+
ifeq ($(CONFIG_X86_32),y)
BITS := 32
UTS_MACHINE := i386
--- a/arch/x86/include/asm/linkage.h
+++ b/arch/x86/include/asm/linkage.h
@@ -14,9 +14,10 @@
#ifdef __ASSEMBLY__
-#if defined(CONFIG_X86_64) || defined(CONFIG_X86_ALIGNMENT_16)
-#define __ALIGN .p2align 4, 0x90
-#define __ALIGN_STR __stringify(__ALIGN)
+#if CONFIG_FUNCTION_ALIGNMENT == 16
+#define __ALIGN .p2align 4, 0x90
+#define __ALIGN_STR __stringify(__ALIGN)
+#define FUNCTION_ALIGNMENT 16
#endif
#if defined(CONFIG_RETHUNK) && !defined(__DISABLE_EXPORTS) && !defined(BUILD_VDSO)
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -82,7 +82,12 @@
#endif
/* Align . to a 8 byte boundary equals to maximum function alignment. */
-#define ALIGN_FUNCTION() . = ALIGN(8)
+#ifndef CONFIG_FUNCTION_ALIGNMENT
+#define __FUNCTION_ALIGNMENT 8
+#else
+#define __FUNCTION_ALIGNMENT CONFIG_FUNCTION_ALIGNMENT
+#endif
+#define ALIGN_FUNCTION() . = ALIGN(__FUNCTION_ALIGNMENT)
/*
* LD_DEAD_CODE_DATA_ELIMINATION option enables -fdata-sections, which