netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 07/29] x86: bpf_jit, use ENTRY+ENDPROC
       [not found] <20170421141305.25180-1-jslaby@suse.cz>
@ 2017-04-21 14:12 ` Jiri Slaby
  2017-04-21 19:32   ` Alexei Starovoitov
  2017-04-21 14:13 ` [PATCH v3 26/29] x86_64: assembly, change all ENTRY to SYM_FUNC_START Jiri Slaby
  1 sibling, 1 reply; 17+ messages in thread
From: Jiri Slaby @ 2017-04-21 14:12 UTC (permalink / raw)
  To: mingo
  Cc: tglx, hpa, x86, jpoimboe, linux-kernel, Jiri Slaby,
	David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, netdev

Do not use a custom macro FUNC for starts of the global functions, use
ENTRY instead.

And while at it, annotate also ends of the functions by ENDPROC.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: netdev@vger.kernel.org
---
 arch/x86/net/bpf_jit.S | 32 ++++++++++++++++++--------------
 1 file changed, 18 insertions(+), 14 deletions(-)

diff --git a/arch/x86/net/bpf_jit.S b/arch/x86/net/bpf_jit.S
index f2a7faf4706e..762c29fb8832 100644
--- a/arch/x86/net/bpf_jit.S
+++ b/arch/x86/net/bpf_jit.S
@@ -23,16 +23,12 @@
 	32 /* space for rbx,r13,r14,r15 */ + \
 	8 /* space for skb_copy_bits */)
 
-#define FUNC(name) \
-	.globl name; \
-	.type name, @function; \
-	name:
-
-FUNC(sk_load_word)
+ENTRY(sk_load_word)
 	test	%esi,%esi
 	js	bpf_slow_path_word_neg
+ENDPROC(sk_load_word)
 
-FUNC(sk_load_word_positive_offset)
+ENTRY(sk_load_word_positive_offset)
 	mov	%r9d,%eax		# hlen
 	sub	%esi,%eax		# hlen - offset
 	cmp	$3,%eax
@@ -40,12 +36,14 @@ FUNC(sk_load_word_positive_offset)
 	mov     (SKBDATA,%rsi),%eax
 	bswap   %eax  			/* ntohl() */
 	ret
+ENDPROC(sk_load_word_positive_offset)
 
-FUNC(sk_load_half)
+ENTRY(sk_load_half)
 	test	%esi,%esi
 	js	bpf_slow_path_half_neg
+ENDPROC(sk_load_half)
 
-FUNC(sk_load_half_positive_offset)
+ENTRY(sk_load_half_positive_offset)
 	mov	%r9d,%eax
 	sub	%esi,%eax		#	hlen - offset
 	cmp	$1,%eax
@@ -53,16 +51,19 @@ FUNC(sk_load_half_positive_offset)
 	movzwl	(SKBDATA,%rsi),%eax
 	rol	$8,%ax			# ntohs()
 	ret
+ENDPROC(sk_load_half_positive_offset)
 
-FUNC(sk_load_byte)
+ENTRY(sk_load_byte)
 	test	%esi,%esi
 	js	bpf_slow_path_byte_neg
+ENDPROC(sk_load_byte)
 
-FUNC(sk_load_byte_positive_offset)
+ENTRY(sk_load_byte_positive_offset)
 	cmp	%esi,%r9d   /* if (offset >= hlen) goto bpf_slow_path_byte */
 	jle	bpf_slow_path_byte
 	movzbl	(SKBDATA,%rsi),%eax
 	ret
+ENDPROC(sk_load_byte_positive_offset)
 
 /* rsi contains offset and can be scratched */
 #define bpf_slow_path_common(LEN)		\
@@ -119,31 +120,34 @@ bpf_slow_path_word_neg:
 	cmp	SKF_MAX_NEG_OFF, %esi	/* test range */
 	jl	bpf_error	/* offset lower -> error  */
 
-FUNC(sk_load_word_negative_offset)
+ENTRY(sk_load_word_negative_offset)
 	sk_negative_common(4)
 	mov	(%rax), %eax
 	bswap	%eax
 	ret
+ENDPROC(sk_load_word_negative_offset)
 
 bpf_slow_path_half_neg:
 	cmp	SKF_MAX_NEG_OFF, %esi
 	jl	bpf_error
 
-FUNC(sk_load_half_negative_offset)
+ENTRY(sk_load_half_negative_offset)
 	sk_negative_common(2)
 	mov	(%rax),%ax
 	rol	$8,%ax
 	movzwl	%ax,%eax
 	ret
+ENDPROC(sk_load_half_negative_offset)
 
 bpf_slow_path_byte_neg:
 	cmp	SKF_MAX_NEG_OFF, %esi
 	jl	bpf_error
 
-FUNC(sk_load_byte_negative_offset)
+ENTRY(sk_load_byte_negative_offset)
 	sk_negative_common(1)
 	movzbl	(%rax), %eax
 	ret
+ENDPROC(sk_load_byte_negative_offset)
 
 bpf_error:
 # force a return 0 from jit handler
-- 
2.12.2

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 26/29] x86_64: assembly, change all ENTRY to SYM_FUNC_START
       [not found] <20170421141305.25180-1-jslaby@suse.cz>
  2017-04-21 14:12 ` [PATCH v3 07/29] x86: bpf_jit, use ENTRY+ENDPROC Jiri Slaby
@ 2017-04-21 14:13 ` Jiri Slaby
  1 sibling, 0 replies; 17+ messages in thread
From: Jiri Slaby @ 2017-04-21 14:13 UTC (permalink / raw)
  To: mingo
  Cc: linux-efi, Matt Fleming, Pavel Machek, hpa, Jiri Slaby, tglx,
	Herbert Xu, x86, James Morris, xen-devel, Bill Metzenthen,
	Alexey Kuznetsov, Len Brown, linux-pm, jpoimboe, Boris Ostrovsky,
	Juergen Gross, Ard Biesheuvel, Hideaki YOSHIFUJI, netdev,
	Rafael J. Wysocki, linux-kernel, Patrick McHardy, linux-crypto,
	David S. Miller

These are all functions which are invoked from elsewhere, so we annotate
them as global using the new SYM_FUNC_START (and their ENDPROC's by
SYM_FUNC_END.)

And make sure ENTRY/ENDPROC is not defined on X86_64.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: x86@kernel.org
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <len.brown@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Bill Metzenthen <billm@melbpc.org.au>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: linux-crypto@vger.kernel.org
Cc: linux-pm@vger.kernel.org
Cc: linux-efi@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Cc: netdev@vger.kernel.org
---
 arch/x86/boot/compressed/efi_thunk_64.S            |  4 +-
 arch/x86/boot/compressed/head_64.S                 | 20 ++++----
 arch/x86/boot/copy.S                               | 16 +++---
 arch/x86/boot/pmjump.S                             |  4 +-
 arch/x86/crypto/aes-i586-asm_32.S                  |  8 +--
 arch/x86/crypto/aes-x86_64-asm_64.S                |  4 +-
 arch/x86/crypto/aes_ctrby8_avx-x86_64.S            | 12 ++---
 arch/x86/crypto/aesni-intel_asm.S                  | 44 ++++++++--------
 arch/x86/crypto/aesni-intel_avx-x86_64.S           | 24 ++++-----
 arch/x86/crypto/blowfish-x86_64-asm_64.S           | 16 +++---
 arch/x86/crypto/camellia-aesni-avx-asm_64.S        | 24 ++++-----
 arch/x86/crypto/camellia-aesni-avx2-asm_64.S       | 24 ++++-----
 arch/x86/crypto/camellia-x86_64-asm_64.S           | 16 +++---
 arch/x86/crypto/cast5-avx-x86_64-asm_64.S          | 16 +++---
 arch/x86/crypto/cast6-avx-x86_64-asm_64.S          | 24 ++++-----
 arch/x86/crypto/chacha20-avx2-x86_64.S             |  4 +-
 arch/x86/crypto/chacha20-ssse3-x86_64.S            |  8 +--
 arch/x86/crypto/crc32-pclmul_asm.S                 |  4 +-
 arch/x86/crypto/crc32c-pcl-intel-asm_64.S          |  4 +-
 arch/x86/crypto/crct10dif-pcl-asm_64.S             |  4 +-
 arch/x86/crypto/des3_ede-asm_64.S                  |  8 +--
 arch/x86/crypto/ghash-clmulni-intel_asm.S          |  8 +--
 arch/x86/crypto/poly1305-avx2-x86_64.S             |  4 +-
 arch/x86/crypto/poly1305-sse2-x86_64.S             |  8 +--
 arch/x86/crypto/salsa20-x86_64-asm_64.S            | 12 ++---
 arch/x86/crypto/serpent-avx-x86_64-asm_64.S        | 24 ++++-----
 arch/x86/crypto/serpent-avx2-asm_64.S              | 24 ++++-----
 arch/x86/crypto/serpent-sse2-x86_64-asm_64.S       |  8 +--
 arch/x86/crypto/sha1-mb/sha1_mb_mgr_flush_avx2.S   |  8 +--
 arch/x86/crypto/sha1-mb/sha1_mb_mgr_submit_avx2.S  |  4 +-
 arch/x86/crypto/sha1-mb/sha1_x8_avx2.S             |  4 +-
 arch/x86/crypto/sha1_avx2_x86_64_asm.S             |  4 +-
 arch/x86/crypto/sha1_ni_asm.S                      |  4 +-
 arch/x86/crypto/sha1_ssse3_asm.S                   |  4 +-
 arch/x86/crypto/sha256-avx-asm.S                   |  4 +-
 arch/x86/crypto/sha256-avx2-asm.S                  |  4 +-
 .../crypto/sha256-mb/sha256_mb_mgr_flush_avx2.S    |  8 +--
 .../crypto/sha256-mb/sha256_mb_mgr_submit_avx2.S   |  4 +-
 arch/x86/crypto/sha256-mb/sha256_x8_avx2.S         |  4 +-
 arch/x86/crypto/sha256-ssse3-asm.S                 |  4 +-
 arch/x86/crypto/sha256_ni_asm.S                    |  4 +-
 arch/x86/crypto/sha512-avx-asm.S                   |  4 +-
 arch/x86/crypto/sha512-avx2-asm.S                  |  4 +-
 .../crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S    |  8 +--
 .../crypto/sha512-mb/sha512_mb_mgr_submit_avx2.S   |  4 +-
 arch/x86/crypto/sha512-mb/sha512_x4_avx2.S         |  4 +-
 arch/x86/crypto/sha512-ssse3-asm.S                 |  4 +-
 arch/x86/crypto/twofish-avx-x86_64-asm_64.S        | 24 ++++-----
 arch/x86/crypto/twofish-x86_64-asm_64-3way.S       |  8 +--
 arch/x86/crypto/twofish-x86_64-asm_64.S            |  8 +--
 arch/x86/entry/entry_64.S                          | 58 +++++++++++-----------
 arch/x86/entry/entry_64_compat.S                   | 16 +++---
 arch/x86/kernel/acpi/wakeup_64.S                   |  8 +--
 arch/x86/kernel/ftrace_64.S                        | 24 ++++-----
 arch/x86/kernel/head_64.S                          | 16 +++---
 arch/x86/lib/checksum_32.S                         |  8 +--
 arch/x86/lib/clear_page_64.S                       | 12 ++---
 arch/x86/lib/cmpxchg16b_emu.S                      |  4 +-
 arch/x86/lib/cmpxchg8b_emu.S                       |  4 +-
 arch/x86/lib/copy_page_64.S                        |  4 +-
 arch/x86/lib/copy_user_64.S                        | 16 +++---
 arch/x86/lib/csum-copy_64.S                        |  4 +-
 arch/x86/lib/getuser.S                             | 16 +++---
 arch/x86/lib/hweight.S                             |  8 +--
 arch/x86/lib/iomap_copy_64.S                       |  4 +-
 arch/x86/lib/memcpy_64.S                           |  4 +-
 arch/x86/lib/memmove_64.S                          |  4 +-
 arch/x86/lib/memset_64.S                           |  4 +-
 arch/x86/lib/msr-reg.S                             |  8 +--
 arch/x86/lib/putuser.S                             | 16 +++---
 arch/x86/lib/rwsem.S                               | 20 ++++----
 arch/x86/net/bpf_jit.S                             | 36 +++++++-------
 arch/x86/platform/efi/efi_stub_64.S                |  4 +-
 arch/x86/platform/efi/efi_thunk_64.S               |  4 +-
 arch/x86/platform/olpc/xo1-wakeup.S                |  4 +-
 arch/x86/power/hibernate_asm_64.S                  | 16 +++---
 arch/x86/realmode/rm/reboot.S                      |  4 +-
 arch/x86/realmode/rm/trampoline_64.S               | 12 ++---
 arch/x86/realmode/rm/wakeup_asm.S                  |  4 +-
 arch/x86/xen/xen-asm.S                             | 20 ++++----
 arch/x86/xen/xen-asm_64.S                          | 28 +++++------
 arch/x86/xen/xen-head.S                            |  8 +--
 include/linux/linkage.h                            |  4 ++
 83 files changed, 455 insertions(+), 451 deletions(-)

diff --git a/arch/x86/boot/compressed/efi_thunk_64.S b/arch/x86/boot/compressed/efi_thunk_64.S
index c072711d8d62..b85b49c36da3 100644
--- a/arch/x86/boot/compressed/efi_thunk_64.S
+++ b/arch/x86/boot/compressed/efi_thunk_64.S
@@ -22,7 +22,7 @@
 
 	.code64
 	.text
-ENTRY(efi64_thunk)
+SYM_FUNC_START(efi64_thunk)
 	push	%rbp
 	push	%rbx
 
@@ -96,7 +96,7 @@ ENTRY(efi64_thunk)
 	pop	%rbx
 	pop	%rbp
 	ret
-ENDPROC(efi64_thunk)
+SYM_FUNC_END(efi64_thunk)
 
 SYM_FUNC_START_LOCAL(efi_exit32)
 	movq	func_rt_ptr(%rip), %rax
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 6eb5a50a301e..34386ba83aef 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -43,7 +43,7 @@
 
 	__HEAD
 	.code32
-ENTRY(startup_32)
+SYM_FUNC_START(startup_32)
 	/*
 	 * 32bit entry is 0 and it is ABI so immutable!
 	 * If we come here directly from a bootloader,
@@ -204,11 +204,11 @@ ENTRY(startup_32)
 
 	/* Jump from 32bit compatibility mode into 64bit mode. */
 	lret
-ENDPROC(startup_32)
+SYM_FUNC_END(startup_32)
 
 #ifdef CONFIG_EFI_MIXED
 	.org 0x190
-ENTRY(efi32_stub_entry)
+SYM_FUNC_START(efi32_stub_entry)
 	add	$0x4, %esp		/* Discard return address */
 	popl	%ecx
 	popl	%edx
@@ -227,12 +227,12 @@ ENTRY(efi32_stub_entry)
 	movl	%eax, efi_config(%ebp)
 
 	jmp	startup_32
-ENDPROC(efi32_stub_entry)
+SYM_FUNC_END(efi32_stub_entry)
 #endif
 
 	.code64
 	.org 0x200
-ENTRY(startup_64)
+SYM_FUNC_START(startup_64)
 	/*
 	 * 64bit entry is 0x200 and it is ABI so immutable!
 	 * We come here either from startup_32 or directly from a
@@ -310,12 +310,12 @@ ENTRY(startup_64)
  */
 	leaq	relocated(%rbx), %rax
 	jmp	*%rax
-ENDPROC(startup_64)
+SYM_FUNC_END(startup_64)
 
 #ifdef CONFIG_EFI_STUB
 
 /* The entry point for the PE/COFF executable is efi_pe_entry. */
-ENTRY(efi_pe_entry)
+SYM_FUNC_START(efi_pe_entry)
 	movq	%rcx, efi64_config(%rip)	/* Handle */
 	movq	%rdx, efi64_config+8(%rip) /* EFI System table pointer */
 
@@ -364,10 +364,10 @@ fail:
 	movl	BP_code32_start(%esi), %eax
 	leaq	startup_64(%rax), %rax
 	jmp	*%rax
-ENDPROC(efi_pe_entry)
+SYM_FUNC_END(efi_pe_entry)
 
 	.org 0x390
-ENTRY(efi64_stub_entry)
+SYM_FUNC_START(efi64_stub_entry)
 	movq	%rdi, efi64_config(%rip)	/* Handle */
 	movq	%rsi, efi64_config+8(%rip) /* EFI System table pointer */
 
@@ -376,7 +376,7 @@ ENTRY(efi64_stub_entry)
 
 	movq	%rdx, %rsi
 	jmp	handover_entry
-ENDPROC(efi64_stub_entry)
+SYM_FUNC_END(efi64_stub_entry)
 #endif
 
 	.text
diff --git a/arch/x86/boot/copy.S b/arch/x86/boot/copy.S
index 030a7bde51da..d44ede19f729 100644
--- a/arch/x86/boot/copy.S
+++ b/arch/x86/boot/copy.S
@@ -17,7 +17,7 @@
 	.code16
 	.text
 
-ENTRY(memcpy)
+SYM_FUNC_START(memcpy)
 	pushw	%si
 	pushw	%di
 	movw	%ax, %di
@@ -31,9 +31,9 @@ ENTRY(memcpy)
 	popw	%di
 	popw	%si
 	retl
-ENDPROC(memcpy)
+SYM_FUNC_END(memcpy)
 
-ENTRY(memset)
+SYM_FUNC_START(memset)
 	pushw	%di
 	movw	%ax, %di
 	movzbl	%dl, %eax
@@ -46,22 +46,22 @@ ENTRY(memset)
 	rep; stosb
 	popw	%di
 	retl
-ENDPROC(memset)
+SYM_FUNC_END(memset)
 
-ENTRY(copy_from_fs)
+SYM_FUNC_START(copy_from_fs)
 	pushw	%ds
 	pushw	%fs
 	popw	%ds
 	calll	memcpy
 	popw	%ds
 	retl
-ENDPROC(copy_from_fs)
+SYM_FUNC_END(copy_from_fs)
 
-ENTRY(copy_to_fs)
+SYM_FUNC_START(copy_to_fs)
 	pushw	%es
 	pushw	%fs
 	popw	%es
 	calll	memcpy
 	popw	%es
 	retl
-ENDPROC(copy_to_fs)
+SYM_FUNC_END(copy_to_fs)
diff --git a/arch/x86/boot/pmjump.S b/arch/x86/boot/pmjump.S
index da86f4df8ffb..1aae6a3b38ab 100644
--- a/arch/x86/boot/pmjump.S
+++ b/arch/x86/boot/pmjump.S
@@ -23,7 +23,7 @@
 /*
  * void protected_mode_jump(u32 entrypoint, u32 bootparams);
  */
-ENTRY(protected_mode_jump)
+SYM_FUNC_START(protected_mode_jump)
 	movl	%edx, %esi		# Pointer to boot_params table
 
 	xorl	%ebx, %ebx
@@ -44,7 +44,7 @@ ENTRY(protected_mode_jump)
 	.byte	0x66, 0xea		# ljmpl opcode
 2:	.long	in_pm32			# offset
 	.word	__BOOT_CS		# segment
-ENDPROC(protected_mode_jump)
+SYM_FUNC_END(protected_mode_jump)
 
 	.code32
 	.section ".text32","ax"
diff --git a/arch/x86/crypto/aes-i586-asm_32.S b/arch/x86/crypto/aes-i586-asm_32.S
index 2849dbc59e11..5b2636c58527 100644
--- a/arch/x86/crypto/aes-i586-asm_32.S
+++ b/arch/x86/crypto/aes-i586-asm_32.S
@@ -223,7 +223,7 @@
 .extern  crypto_ft_tab
 .extern  crypto_fl_tab
 
-ENTRY(aes_enc_blk)
+SYM_FUNC_START(aes_enc_blk)
 	push    %ebp
 	mov     ctx(%esp),%ebp
 
@@ -287,7 +287,7 @@ ENTRY(aes_enc_blk)
 	mov     %r0,(%ebp)
 	pop     %ebp
 	ret
-ENDPROC(aes_enc_blk)
+SYM_FUNC_END(aes_enc_blk)
 
 // AES (Rijndael) Decryption Subroutine
 /* void aes_dec_blk(struct crypto_aes_ctx *ctx, u8 *out_blk, const u8 *in_blk) */
@@ -295,7 +295,7 @@ ENDPROC(aes_enc_blk)
 .extern  crypto_it_tab
 .extern  crypto_il_tab
 
-ENTRY(aes_dec_blk)
+SYM_FUNC_START(aes_dec_blk)
 	push    %ebp
 	mov     ctx(%esp),%ebp
 
@@ -359,4 +359,4 @@ ENTRY(aes_dec_blk)
 	mov     %r0,(%ebp)
 	pop     %ebp
 	ret
-ENDPROC(aes_dec_blk)
+SYM_FUNC_END(aes_dec_blk)
diff --git a/arch/x86/crypto/aes-x86_64-asm_64.S b/arch/x86/crypto/aes-x86_64-asm_64.S
index 910565547163..bd02ef4a10fe 100644
--- a/arch/x86/crypto/aes-x86_64-asm_64.S
+++ b/arch/x86/crypto/aes-x86_64-asm_64.S
@@ -50,7 +50,7 @@
 #define R11	%r11
 
 #define prologue(FUNC,KEY,B128,B192,r1,r2,r3,r4,r5,r6,r7,r8,r9,r10,r11) \
-	ENTRY(FUNC);			\
+	SYM_FUNC_START(FUNC);		\
 	movq	r1,r2;			\
 	movq	r3,r4;			\
 	leaq	KEY+48(r8),r9;		\
@@ -78,7 +78,7 @@
 	movl	r7 ## E,8(r9);		\
 	movl	r8 ## E,12(r9);		\
 	ret;				\
-	ENDPROC(FUNC);
+	SYM_FUNC_END(FUNC);
 
 #define round(TAB,OFFSET,r1,r2,r3,r4,r5,r6,r7,r8,ra,rb,rc,rd) \
 	movzbl	r2 ## H,r5 ## E;	\
diff --git a/arch/x86/crypto/aes_ctrby8_avx-x86_64.S b/arch/x86/crypto/aes_ctrby8_avx-x86_64.S
index 5f6a5af9c489..ec437db1fa54 100644
--- a/arch/x86/crypto/aes_ctrby8_avx-x86_64.S
+++ b/arch/x86/crypto/aes_ctrby8_avx-x86_64.S
@@ -544,11 +544,11 @@ ddq_add_8:
  * aes_ctr_enc_128_avx_by8(void *in, void *iv, void *keys, void *out,
  *			unsigned int num_bytes)
  */
-ENTRY(aes_ctr_enc_128_avx_by8)
+SYM_FUNC_START(aes_ctr_enc_128_avx_by8)
 	/* call the aes main loop */
 	do_aes_ctrmain KEY_128
 
-ENDPROC(aes_ctr_enc_128_avx_by8)
+SYM_FUNC_END(aes_ctr_enc_128_avx_by8)
 
 /*
  * routine to do AES192 CTR enc/decrypt "by8"
@@ -557,11 +557,11 @@ ENDPROC(aes_ctr_enc_128_avx_by8)
  * aes_ctr_enc_192_avx_by8(void *in, void *iv, void *keys, void *out,
  *			unsigned int num_bytes)
  */
-ENTRY(aes_ctr_enc_192_avx_by8)
+SYM_FUNC_START(aes_ctr_enc_192_avx_by8)
 	/* call the aes main loop */
 	do_aes_ctrmain KEY_192
 
-ENDPROC(aes_ctr_enc_192_avx_by8)
+SYM_FUNC_END(aes_ctr_enc_192_avx_by8)
 
 /*
  * routine to do AES256 CTR enc/decrypt "by8"
@@ -570,8 +570,8 @@ ENDPROC(aes_ctr_enc_192_avx_by8)
  * aes_ctr_enc_256_avx_by8(void *in, void *iv, void *keys, void *out,
  *			unsigned int num_bytes)
  */
-ENTRY(aes_ctr_enc_256_avx_by8)
+SYM_FUNC_START(aes_ctr_enc_256_avx_by8)
 	/* call the aes main loop */
 	do_aes_ctrmain KEY_256
 
-ENDPROC(aes_ctr_enc_256_avx_by8)
+SYM_FUNC_END(aes_ctr_enc_256_avx_by8)
diff --git a/arch/x86/crypto/aesni-intel_asm.S b/arch/x86/crypto/aesni-intel_asm.S
index 3469670df832..934d49f71b48 100644
--- a/arch/x86/crypto/aesni-intel_asm.S
+++ b/arch/x86/crypto/aesni-intel_asm.S
@@ -1301,7 +1301,7 @@ _esb_loop_\@:
 * poly = x^128 + x^127 + x^126 + x^121 + 1
 *
 *****************************************************************************/
-ENTRY(aesni_gcm_dec)
+SYM_FUNC_START(aesni_gcm_dec)
 	push	%r12
 	push	%r13
 	push	%r14
@@ -1475,7 +1475,7 @@ _return_T_done_decrypt:
 	pop	%r13
 	pop	%r12
 	ret
-ENDPROC(aesni_gcm_dec)
+SYM_FUNC_END(aesni_gcm_dec)
 
 
 /*****************************************************************************
@@ -1561,7 +1561,7 @@ ENDPROC(aesni_gcm_dec)
 *
 * poly = x^128 + x^127 + x^126 + x^121 + 1
 ***************************************************************************/
-ENTRY(aesni_gcm_enc)
+SYM_FUNC_START(aesni_gcm_enc)
 	push	%r12
 	push	%r13
 	push	%r14
@@ -1739,7 +1739,7 @@ _return_T_done_encrypt:
 	pop	%r13
 	pop	%r12
 	ret
-ENDPROC(aesni_gcm_enc)
+SYM_FUNC_END(aesni_gcm_enc)
 
 #endif
 
@@ -1817,7 +1817,7 @@ SYM_FUNC_END(_key_expansion_256b)
  * int aesni_set_key(struct crypto_aes_ctx *ctx, const u8 *in_key,
  *                   unsigned int key_len)
  */
-ENTRY(aesni_set_key)
+SYM_FUNC_START(aesni_set_key)
 	FRAME_BEGIN
 #ifndef __x86_64__
 	pushl KEYP
@@ -1926,12 +1926,12 @@ ENTRY(aesni_set_key)
 #endif
 	FRAME_END
 	ret
-ENDPROC(aesni_set_key)
+SYM_FUNC_END(aesni_set_key)
 
 /*
  * void aesni_enc(struct crypto_aes_ctx *ctx, u8 *dst, const u8 *src)
  */
-ENTRY(aesni_enc)
+SYM_FUNC_START(aesni_enc)
 	FRAME_BEGIN
 #ifndef __x86_64__
 	pushl KEYP
@@ -1950,7 +1950,7 @@ ENTRY(aesni_enc)
 #endif
 	FRAME_END
 	ret
-ENDPROC(aesni_enc)
+SYM_FUNC_END(aesni_enc)
 
 /*
  * _aesni_enc1:		internal ABI
@@ -2120,7 +2120,7 @@ SYM_FUNC_END(_aesni_enc4)
 /*
  * void aesni_dec (struct crypto_aes_ctx *ctx, u8 *dst, const u8 *src)
  */
-ENTRY(aesni_dec)
+SYM_FUNC_START(aesni_dec)
 	FRAME_BEGIN
 #ifndef __x86_64__
 	pushl KEYP
@@ -2140,7 +2140,7 @@ ENTRY(aesni_dec)
 #endif
 	FRAME_END
 	ret
-ENDPROC(aesni_dec)
+SYM_FUNC_END(aesni_dec)
 
 /*
  * _aesni_dec1:		internal ABI
@@ -2311,7 +2311,7 @@ SYM_FUNC_END(_aesni_dec4)
  * void aesni_ecb_enc(struct crypto_aes_ctx *ctx, const u8 *dst, u8 *src,
  *		      size_t len)
  */
-ENTRY(aesni_ecb_enc)
+SYM_FUNC_START(aesni_ecb_enc)
 	FRAME_BEGIN
 #ifndef __x86_64__
 	pushl LEN
@@ -2365,13 +2365,13 @@ ENTRY(aesni_ecb_enc)
 #endif
 	FRAME_END
 	ret
-ENDPROC(aesni_ecb_enc)
+SYM_FUNC_END(aesni_ecb_enc)
 
 /*
  * void aesni_ecb_dec(struct crypto_aes_ctx *ctx, const u8 *dst, u8 *src,
  *		      size_t len);
  */
-ENTRY(aesni_ecb_dec)
+SYM_FUNC_START(aesni_ecb_dec)
 	FRAME_BEGIN
 #ifndef __x86_64__
 	pushl LEN
@@ -2426,13 +2426,13 @@ ENTRY(aesni_ecb_dec)
 #endif
 	FRAME_END
 	ret
-ENDPROC(aesni_ecb_dec)
+SYM_FUNC_END(aesni_ecb_dec)
 
 /*
  * void aesni_cbc_enc(struct crypto_aes_ctx *ctx, const u8 *dst, u8 *src,
  *		      size_t len, u8 *iv)
  */
-ENTRY(aesni_cbc_enc)
+SYM_FUNC_START(aesni_cbc_enc)
 	FRAME_BEGIN
 #ifndef __x86_64__
 	pushl IVP
@@ -2470,13 +2470,13 @@ ENTRY(aesni_cbc_enc)
 #endif
 	FRAME_END
 	ret
-ENDPROC(aesni_cbc_enc)
+SYM_FUNC_END(aesni_cbc_enc)
 
 /*
  * void aesni_cbc_dec(struct crypto_aes_ctx *ctx, const u8 *dst, u8 *src,
  *		      size_t len, u8 *iv)
  */
-ENTRY(aesni_cbc_dec)
+SYM_FUNC_START(aesni_cbc_dec)
 	FRAME_BEGIN
 #ifndef __x86_64__
 	pushl IVP
@@ -2563,7 +2563,7 @@ ENTRY(aesni_cbc_dec)
 #endif
 	FRAME_END
 	ret
-ENDPROC(aesni_cbc_dec)
+SYM_FUNC_END(aesni_cbc_dec)
 
 #ifdef __x86_64__
 .pushsection .rodata
@@ -2625,7 +2625,7 @@ SYM_FUNC_END(_aesni_inc)
  * void aesni_ctr_enc(struct crypto_aes_ctx *ctx, const u8 *dst, u8 *src,
  *		      size_t len, u8 *iv)
  */
-ENTRY(aesni_ctr_enc)
+SYM_FUNC_START(aesni_ctr_enc)
 	FRAME_BEGIN
 	cmp $16, LEN
 	jb .Lctr_enc_just_ret
@@ -2682,7 +2682,7 @@ ENTRY(aesni_ctr_enc)
 .Lctr_enc_just_ret:
 	FRAME_END
 	ret
-ENDPROC(aesni_ctr_enc)
+SYM_FUNC_END(aesni_ctr_enc)
 
 /*
  * _aesni_gf128mul_x_ble:		internal ABI
@@ -2706,7 +2706,7 @@ ENDPROC(aesni_ctr_enc)
  * void aesni_xts_crypt8(struct crypto_aes_ctx *ctx, const u8 *dst, u8 *src,
  *			 bool enc, u8 *iv)
  */
-ENTRY(aesni_xts_crypt8)
+SYM_FUNC_START(aesni_xts_crypt8)
 	FRAME_BEGIN
 	cmpb $0, %cl
 	movl $0, %ecx
@@ -2810,6 +2810,6 @@ ENTRY(aesni_xts_crypt8)
 
 	FRAME_END
 	ret
-ENDPROC(aesni_xts_crypt8)
+SYM_FUNC_END(aesni_xts_crypt8)
 
 #endif
diff --git a/arch/x86/crypto/aesni-intel_avx-x86_64.S b/arch/x86/crypto/aesni-intel_avx-x86_64.S
index d664382c6e56..2a5228b675b2 100644
--- a/arch/x86/crypto/aesni-intel_avx-x86_64.S
+++ b/arch/x86/crypto/aesni-intel_avx-x86_64.S
@@ -1460,7 +1460,7 @@ _return_T_done\@:
 #        (gcm_data     *my_ctx_data,
 #        u8     *hash_subkey)# /* H, the Hash sub key input. Data starts on a 16-byte boundary. */
 #############################################################
-ENTRY(aesni_gcm_precomp_avx_gen2)
+SYM_FUNC_START(aesni_gcm_precomp_avx_gen2)
         #the number of pushes must equal STACK_OFFSET
         push    %r12
         push    %r13
@@ -1503,7 +1503,7 @@ ENTRY(aesni_gcm_precomp_avx_gen2)
         pop     %r13
         pop     %r12
         ret
-ENDPROC(aesni_gcm_precomp_avx_gen2)
+SYM_FUNC_END(aesni_gcm_precomp_avx_gen2)
 
 ###############################################################################
 #void   aesni_gcm_enc_avx_gen2(
@@ -1521,10 +1521,10 @@ ENDPROC(aesni_gcm_precomp_avx_gen2)
 #        u64     auth_tag_len)# /* Authenticated Tag Length in bytes.
 #				Valid values are 16 (most likely), 12 or 8. */
 ###############################################################################
-ENTRY(aesni_gcm_enc_avx_gen2)
+SYM_FUNC_START(aesni_gcm_enc_avx_gen2)
         GCM_ENC_DEC_AVX     ENC
 	ret
-ENDPROC(aesni_gcm_enc_avx_gen2)
+SYM_FUNC_END(aesni_gcm_enc_avx_gen2)
 
 ###############################################################################
 #void   aesni_gcm_dec_avx_gen2(
@@ -1542,10 +1542,10 @@ ENDPROC(aesni_gcm_enc_avx_gen2)
 #        u64     auth_tag_len)# /* Authenticated Tag Length in bytes.
 #				Valid values are 16 (most likely), 12 or 8. */
 ###############################################################################
-ENTRY(aesni_gcm_dec_avx_gen2)
+SYM_FUNC_START(aesni_gcm_dec_avx_gen2)
         GCM_ENC_DEC_AVX     DEC
 	ret
-ENDPROC(aesni_gcm_dec_avx_gen2)
+SYM_FUNC_END(aesni_gcm_dec_avx_gen2)
 #endif /* CONFIG_AS_AVX */
 
 #ifdef CONFIG_AS_AVX2
@@ -2736,7 +2736,7 @@ _return_T_done\@:
 #        u8     *hash_subkey)# /* H, the Hash sub key input.
 #				Data starts on a 16-byte boundary. */
 #############################################################
-ENTRY(aesni_gcm_precomp_avx_gen4)
+SYM_FUNC_START(aesni_gcm_precomp_avx_gen4)
         #the number of pushes must equal STACK_OFFSET
         push    %r12
         push    %r13
@@ -2779,7 +2779,7 @@ ENTRY(aesni_gcm_precomp_avx_gen4)
         pop     %r13
         pop     %r12
         ret
-ENDPROC(aesni_gcm_precomp_avx_gen4)
+SYM_FUNC_END(aesni_gcm_precomp_avx_gen4)
 
 
 ###############################################################################
@@ -2798,10 +2798,10 @@ ENDPROC(aesni_gcm_precomp_avx_gen4)
 #        u64     auth_tag_len)# /* Authenticated Tag Length in bytes.
 #				Valid values are 16 (most likely), 12 or 8. */
 ###############################################################################
-ENTRY(aesni_gcm_enc_avx_gen4)
+SYM_FUNC_START(aesni_gcm_enc_avx_gen4)
         GCM_ENC_DEC_AVX2     ENC
 	ret
-ENDPROC(aesni_gcm_enc_avx_gen4)
+SYM_FUNC_END(aesni_gcm_enc_avx_gen4)
 
 ###############################################################################
 #void   aesni_gcm_dec_avx_gen4(
@@ -2819,9 +2819,9 @@ ENDPROC(aesni_gcm_enc_avx_gen4)
 #        u64     auth_tag_len)# /* Authenticated Tag Length in bytes.
 #				Valid values are 16 (most likely), 12 or 8. */
 ###############################################################################
-ENTRY(aesni_gcm_dec_avx_gen4)
+SYM_FUNC_START(aesni_gcm_dec_avx_gen4)
         GCM_ENC_DEC_AVX2     DEC
 	ret
-ENDPROC(aesni_gcm_dec_avx_gen4)
+SYM_FUNC_END(aesni_gcm_dec_avx_gen4)
 
 #endif /* CONFIG_AS_AVX2 */
diff --git a/arch/x86/crypto/blowfish-x86_64-asm_64.S b/arch/x86/crypto/blowfish-x86_64-asm_64.S
index 246c67006ed0..fc96846d5707 100644
--- a/arch/x86/crypto/blowfish-x86_64-asm_64.S
+++ b/arch/x86/crypto/blowfish-x86_64-asm_64.S
@@ -118,7 +118,7 @@
 	bswapq 			RX0; \
 	xorq RX0, 		(RIO);
 
-ENTRY(__blowfish_enc_blk)
+SYM_FUNC_START(__blowfish_enc_blk)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst
@@ -153,9 +153,9 @@ ENTRY(__blowfish_enc_blk)
 .L__enc_xor:
 	xor_block();
 	ret;
-ENDPROC(__blowfish_enc_blk)
+SYM_FUNC_END(__blowfish_enc_blk)
 
-ENTRY(blowfish_dec_blk)
+SYM_FUNC_START(blowfish_dec_blk)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst
@@ -184,7 +184,7 @@ ENTRY(blowfish_dec_blk)
 	movq %r11, %rbp;
 
 	ret;
-ENDPROC(blowfish_dec_blk)
+SYM_FUNC_END(blowfish_dec_blk)
 
 /**********************************************************************
   4-way blowfish, four blocks parallel
@@ -296,7 +296,7 @@ ENDPROC(blowfish_dec_blk)
 	bswapq 			RX3; \
 	xorq RX3,		24(RIO);
 
-ENTRY(__blowfish_enc_blk_4way)
+SYM_FUNC_START(__blowfish_enc_blk_4way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst
@@ -342,9 +342,9 @@ ENTRY(__blowfish_enc_blk_4way)
 	popq %rbx;
 	popq %rbp;
 	ret;
-ENDPROC(__blowfish_enc_blk_4way)
+SYM_FUNC_END(__blowfish_enc_blk_4way)
 
-ENTRY(blowfish_dec_blk_4way)
+SYM_FUNC_START(blowfish_dec_blk_4way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst
@@ -376,4 +376,4 @@ ENTRY(blowfish_dec_blk_4way)
 	popq %rbp;
 
 	ret;
-ENDPROC(blowfish_dec_blk_4way)
+SYM_FUNC_END(blowfish_dec_blk_4way)
diff --git a/arch/x86/crypto/camellia-aesni-avx-asm_64.S b/arch/x86/crypto/camellia-aesni-avx-asm_64.S
index 8b6a65524067..c055e844203f 100644
--- a/arch/x86/crypto/camellia-aesni-avx-asm_64.S
+++ b/arch/x86/crypto/camellia-aesni-avx-asm_64.S
@@ -892,7 +892,7 @@ SYM_FUNC_START_LOCAL(__camellia_dec_blk16)
 	jmp .Ldec_max24;
 SYM_FUNC_END(__camellia_dec_blk16)
 
-ENTRY(camellia_ecb_enc_16way)
+SYM_FUNC_START(camellia_ecb_enc_16way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst (16 blocks)
@@ -915,9 +915,9 @@ ENTRY(camellia_ecb_enc_16way)
 
 	FRAME_END
 	ret;
-ENDPROC(camellia_ecb_enc_16way)
+SYM_FUNC_END(camellia_ecb_enc_16way)
 
-ENTRY(camellia_ecb_dec_16way)
+SYM_FUNC_START(camellia_ecb_dec_16way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst (16 blocks)
@@ -945,9 +945,9 @@ ENTRY(camellia_ecb_dec_16way)
 
 	FRAME_END
 	ret;
-ENDPROC(camellia_ecb_dec_16way)
+SYM_FUNC_END(camellia_ecb_dec_16way)
 
-ENTRY(camellia_cbc_dec_16way)
+SYM_FUNC_START(camellia_cbc_dec_16way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst (16 blocks)
@@ -996,7 +996,7 @@ ENTRY(camellia_cbc_dec_16way)
 
 	FRAME_END
 	ret;
-ENDPROC(camellia_cbc_dec_16way)
+SYM_FUNC_END(camellia_cbc_dec_16way)
 
 #define inc_le128(x, minus_one, tmp) \
 	vpcmpeqq minus_one, x, tmp; \
@@ -1004,7 +1004,7 @@ ENDPROC(camellia_cbc_dec_16way)
 	vpslldq $8, tmp, tmp; \
 	vpsubq tmp, x, x;
 
-ENTRY(camellia_ctr_16way)
+SYM_FUNC_START(camellia_ctr_16way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst (16 blocks)
@@ -1109,7 +1109,7 @@ ENTRY(camellia_ctr_16way)
 
 	FRAME_END
 	ret;
-ENDPROC(camellia_ctr_16way)
+SYM_FUNC_END(camellia_ctr_16way)
 
 #define gf128mul_x_ble(iv, mask, tmp) \
 	vpsrad $31, iv, tmp; \
@@ -1255,7 +1255,7 @@ SYM_FUNC_START_LOCAL(camellia_xts_crypt_16way)
 	ret;
 SYM_FUNC_END(camellia_xts_crypt_16way)
 
-ENTRY(camellia_xts_enc_16way)
+SYM_FUNC_START(camellia_xts_enc_16way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst (16 blocks)
@@ -1267,9 +1267,9 @@ ENTRY(camellia_xts_enc_16way)
 	leaq __camellia_enc_blk16, %r9;
 
 	jmp camellia_xts_crypt_16way;
-ENDPROC(camellia_xts_enc_16way)
+SYM_FUNC_END(camellia_xts_enc_16way)
 
-ENTRY(camellia_xts_dec_16way)
+SYM_FUNC_START(camellia_xts_dec_16way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst (16 blocks)
@@ -1285,4 +1285,4 @@ ENTRY(camellia_xts_dec_16way)
 	leaq __camellia_dec_blk16, %r9;
 
 	jmp camellia_xts_crypt_16way;
-ENDPROC(camellia_xts_dec_16way)
+SYM_FUNC_END(camellia_xts_dec_16way)
diff --git a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
index 96b44ad85c59..0c1af357c86d 100644
--- a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
+++ b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
@@ -935,7 +935,7 @@ SYM_FUNC_START_LOCAL(__camellia_dec_blk32)
 	jmp .Ldec_max24;
 SYM_FUNC_END(__camellia_dec_blk32)
 
-ENTRY(camellia_ecb_enc_32way)
+SYM_FUNC_START(camellia_ecb_enc_32way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst (32 blocks)
@@ -962,9 +962,9 @@ ENTRY(camellia_ecb_enc_32way)
 
 	FRAME_END
 	ret;
-ENDPROC(camellia_ecb_enc_32way)
+SYM_FUNC_END(camellia_ecb_enc_32way)
 
-ENTRY(camellia_ecb_dec_32way)
+SYM_FUNC_START(camellia_ecb_dec_32way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst (32 blocks)
@@ -996,9 +996,9 @@ ENTRY(camellia_ecb_dec_32way)
 
 	FRAME_END
 	ret;
-ENDPROC(camellia_ecb_dec_32way)
+SYM_FUNC_END(camellia_ecb_dec_32way)
 
-ENTRY(camellia_cbc_dec_32way)
+SYM_FUNC_START(camellia_cbc_dec_32way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst (32 blocks)
@@ -1064,7 +1064,7 @@ ENTRY(camellia_cbc_dec_32way)
 
 	FRAME_END
 	ret;
-ENDPROC(camellia_cbc_dec_32way)
+SYM_FUNC_END(camellia_cbc_dec_32way)
 
 #define inc_le128(x, minus_one, tmp) \
 	vpcmpeqq minus_one, x, tmp; \
@@ -1080,7 +1080,7 @@ ENDPROC(camellia_cbc_dec_32way)
 	vpslldq $8, tmp1, tmp1; \
 	vpsubq tmp1, x, x;
 
-ENTRY(camellia_ctr_32way)
+SYM_FUNC_START(camellia_ctr_32way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst (32 blocks)
@@ -1204,7 +1204,7 @@ ENTRY(camellia_ctr_32way)
 
 	FRAME_END
 	ret;
-ENDPROC(camellia_ctr_32way)
+SYM_FUNC_END(camellia_ctr_32way)
 
 #define gf128mul_x_ble(iv, mask, tmp) \
 	vpsrad $31, iv, tmp; \
@@ -1373,7 +1373,7 @@ SYM_FUNC_START_LOCAL(camellia_xts_crypt_32way)
 	ret;
 SYM_FUNC_END(camellia_xts_crypt_32way)
 
-ENTRY(camellia_xts_enc_32way)
+SYM_FUNC_START(camellia_xts_enc_32way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst (32 blocks)
@@ -1386,9 +1386,9 @@ ENTRY(camellia_xts_enc_32way)
 	leaq __camellia_enc_blk32, %r9;
 
 	jmp camellia_xts_crypt_32way;
-ENDPROC(camellia_xts_enc_32way)
+SYM_FUNC_END(camellia_xts_enc_32way)
 
-ENTRY(camellia_xts_dec_32way)
+SYM_FUNC_START(camellia_xts_dec_32way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst (32 blocks)
@@ -1404,4 +1404,4 @@ ENTRY(camellia_xts_dec_32way)
 	leaq __camellia_dec_blk32, %r9;
 
 	jmp camellia_xts_crypt_32way;
-ENDPROC(camellia_xts_dec_32way)
+SYM_FUNC_END(camellia_xts_dec_32way)
diff --git a/arch/x86/crypto/camellia-x86_64-asm_64.S b/arch/x86/crypto/camellia-x86_64-asm_64.S
index 310319c601ed..1886ea733a76 100644
--- a/arch/x86/crypto/camellia-x86_64-asm_64.S
+++ b/arch/x86/crypto/camellia-x86_64-asm_64.S
@@ -190,7 +190,7 @@
 	bswapq				RAB0; \
 	movq RAB0,			4*2(RIO);
 
-ENTRY(__camellia_enc_blk)
+SYM_FUNC_START(__camellia_enc_blk)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst
@@ -235,9 +235,9 @@ ENTRY(__camellia_enc_blk)
 
 	movq RRBP, %rbp;
 	ret;
-ENDPROC(__camellia_enc_blk)
+SYM_FUNC_END(__camellia_enc_blk)
 
-ENTRY(camellia_dec_blk)
+SYM_FUNC_START(camellia_dec_blk)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst
@@ -273,7 +273,7 @@ ENTRY(camellia_dec_blk)
 
 	movq RRBP, %rbp;
 	ret;
-ENDPROC(camellia_dec_blk)
+SYM_FUNC_END(camellia_dec_blk)
 
 /**********************************************************************
   2-way camellia
@@ -424,7 +424,7 @@ ENDPROC(camellia_dec_blk)
 		bswapq				RAB1; \
 		movq RAB1,			12*2(RIO);
 
-ENTRY(__camellia_enc_blk_2way)
+SYM_FUNC_START(__camellia_enc_blk_2way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst
@@ -471,9 +471,9 @@ ENTRY(__camellia_enc_blk_2way)
 	movq RRBP, %rbp;
 	popq %rbx;
 	ret;
-ENDPROC(__camellia_enc_blk_2way)
+SYM_FUNC_END(__camellia_enc_blk_2way)
 
-ENTRY(camellia_dec_blk_2way)
+SYM_FUNC_START(camellia_dec_blk_2way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst
@@ -511,4 +511,4 @@ ENTRY(camellia_dec_blk_2way)
 	movq RRBP, %rbp;
 	movq RXOR, %rbx;
 	ret;
-ENDPROC(camellia_dec_blk_2way)
+SYM_FUNC_END(camellia_dec_blk_2way)
diff --git a/arch/x86/crypto/cast5-avx-x86_64-asm_64.S b/arch/x86/crypto/cast5-avx-x86_64-asm_64.S
index 0fe153a87d90..bacf6c989d10 100644
--- a/arch/x86/crypto/cast5-avx-x86_64-asm_64.S
+++ b/arch/x86/crypto/cast5-avx-x86_64-asm_64.S
@@ -370,7 +370,7 @@ SYM_FUNC_START_LOCAL(__cast5_dec_blk16)
 	jmp .L__dec_tail;
 SYM_FUNC_END(__cast5_dec_blk16)
 
-ENTRY(cast5_ecb_enc_16way)
+SYM_FUNC_START(cast5_ecb_enc_16way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst
@@ -402,9 +402,9 @@ ENTRY(cast5_ecb_enc_16way)
 
 	FRAME_END
 	ret;
-ENDPROC(cast5_ecb_enc_16way)
+SYM_FUNC_END(cast5_ecb_enc_16way)
 
-ENTRY(cast5_ecb_dec_16way)
+SYM_FUNC_START(cast5_ecb_dec_16way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst
@@ -436,9 +436,9 @@ ENTRY(cast5_ecb_dec_16way)
 
 	FRAME_END
 	ret;
-ENDPROC(cast5_ecb_dec_16way)
+SYM_FUNC_END(cast5_ecb_dec_16way)
 
-ENTRY(cast5_cbc_dec_16way)
+SYM_FUNC_START(cast5_cbc_dec_16way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst
@@ -487,9 +487,9 @@ ENTRY(cast5_cbc_dec_16way)
 
 	FRAME_END
 	ret;
-ENDPROC(cast5_cbc_dec_16way)
+SYM_FUNC_END(cast5_cbc_dec_16way)
 
-ENTRY(cast5_ctr_16way)
+SYM_FUNC_START(cast5_ctr_16way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst
@@ -562,4 +562,4 @@ ENTRY(cast5_ctr_16way)
 
 	FRAME_END
 	ret;
-ENDPROC(cast5_ctr_16way)
+SYM_FUNC_END(cast5_ctr_16way)
diff --git a/arch/x86/crypto/cast6-avx-x86_64-asm_64.S b/arch/x86/crypto/cast6-avx-x86_64-asm_64.S
index 0d71989fff90..cfb688a40b03 100644
--- a/arch/x86/crypto/cast6-avx-x86_64-asm_64.S
+++ b/arch/x86/crypto/cast6-avx-x86_64-asm_64.S
@@ -352,7 +352,7 @@ SYM_FUNC_START_LOCAL(__cast6_dec_blk8)
 	ret;
 SYM_FUNC_END(__cast6_dec_blk8)
 
-ENTRY(cast6_ecb_enc_8way)
+SYM_FUNC_START(cast6_ecb_enc_8way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst
@@ -370,9 +370,9 @@ ENTRY(cast6_ecb_enc_8way)
 
 	FRAME_END
 	ret;
-ENDPROC(cast6_ecb_enc_8way)
+SYM_FUNC_END(cast6_ecb_enc_8way)
 
-ENTRY(cast6_ecb_dec_8way)
+SYM_FUNC_START(cast6_ecb_dec_8way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst
@@ -390,9 +390,9 @@ ENTRY(cast6_ecb_dec_8way)
 
 	FRAME_END
 	ret;
-ENDPROC(cast6_ecb_dec_8way)
+SYM_FUNC_END(cast6_ecb_dec_8way)
 
-ENTRY(cast6_cbc_dec_8way)
+SYM_FUNC_START(cast6_cbc_dec_8way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst
@@ -415,9 +415,9 @@ ENTRY(cast6_cbc_dec_8way)
 
 	FRAME_END
 	ret;
-ENDPROC(cast6_cbc_dec_8way)
+SYM_FUNC_END(cast6_cbc_dec_8way)
 
-ENTRY(cast6_ctr_8way)
+SYM_FUNC_START(cast6_ctr_8way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst
@@ -442,9 +442,9 @@ ENTRY(cast6_ctr_8way)
 
 	FRAME_END
 	ret;
-ENDPROC(cast6_ctr_8way)
+SYM_FUNC_END(cast6_ctr_8way)
 
-ENTRY(cast6_xts_enc_8way)
+SYM_FUNC_START(cast6_xts_enc_8way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst
@@ -466,9 +466,9 @@ ENTRY(cast6_xts_enc_8way)
 
 	FRAME_END
 	ret;
-ENDPROC(cast6_xts_enc_8way)
+SYM_FUNC_END(cast6_xts_enc_8way)
 
-ENTRY(cast6_xts_dec_8way)
+SYM_FUNC_START(cast6_xts_dec_8way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst
@@ -490,4 +490,4 @@ ENTRY(cast6_xts_dec_8way)
 
 	FRAME_END
 	ret;
-ENDPROC(cast6_xts_dec_8way)
+SYM_FUNC_END(cast6_xts_dec_8way)
diff --git a/arch/x86/crypto/chacha20-avx2-x86_64.S b/arch/x86/crypto/chacha20-avx2-x86_64.S
index 3a2dc3dc6cac..f06f181c572a 100644
--- a/arch/x86/crypto/chacha20-avx2-x86_64.S
+++ b/arch/x86/crypto/chacha20-avx2-x86_64.S
@@ -28,7 +28,7 @@ CTRINC:	.octa 0x00000003000000020000000100000000
 
 .text
 
-ENTRY(chacha20_8block_xor_avx2)
+SYM_FUNC_START(chacha20_8block_xor_avx2)
 	# %rdi: Input state matrix, s
 	# %rsi: 8 data blocks output, o
 	# %rdx: 8 data blocks input, i
@@ -445,4 +445,4 @@ ENTRY(chacha20_8block_xor_avx2)
 	vzeroupper
 	mov		%r8,%rsp
 	ret
-ENDPROC(chacha20_8block_xor_avx2)
+SYM_FUNC_END(chacha20_8block_xor_avx2)
diff --git a/arch/x86/crypto/chacha20-ssse3-x86_64.S b/arch/x86/crypto/chacha20-ssse3-x86_64.S
index 3f511a7d73b8..1e8b93bd2d93 100644
--- a/arch/x86/crypto/chacha20-ssse3-x86_64.S
+++ b/arch/x86/crypto/chacha20-ssse3-x86_64.S
@@ -23,7 +23,7 @@ CTRINC:	.octa 0x00000003000000020000000100000000
 
 .text
 
-ENTRY(chacha20_block_xor_ssse3)
+SYM_FUNC_START(chacha20_block_xor_ssse3)
 	# %rdi: Input state matrix, s
 	# %rsi: 1 data block output, o
 	# %rdx: 1 data block input, i
@@ -143,9 +143,9 @@ ENTRY(chacha20_block_xor_ssse3)
 	movdqu		%xmm3,0x30(%rsi)
 
 	ret
-ENDPROC(chacha20_block_xor_ssse3)
+SYM_FUNC_END(chacha20_block_xor_ssse3)
 
-ENTRY(chacha20_4block_xor_ssse3)
+SYM_FUNC_START(chacha20_4block_xor_ssse3)
 	# %rdi: Input state matrix, s
 	# %rsi: 4 data blocks output, o
 	# %rdx: 4 data blocks input, i
@@ -627,4 +627,4 @@ ENTRY(chacha20_4block_xor_ssse3)
 
 	mov		%r11,%rsp
 	ret
-ENDPROC(chacha20_4block_xor_ssse3)
+SYM_FUNC_END(chacha20_4block_xor_ssse3)
diff --git a/arch/x86/crypto/crc32-pclmul_asm.S b/arch/x86/crypto/crc32-pclmul_asm.S
index f247304299a2..690d429ef5e7 100644
--- a/arch/x86/crypto/crc32-pclmul_asm.S
+++ b/arch/x86/crypto/crc32-pclmul_asm.S
@@ -102,7 +102,7 @@
  *	                     size_t len, uint crc32)
  */
 
-ENTRY(crc32_pclmul_le_16) /* buffer and buffer size are 16 bytes aligned */
+SYM_FUNC_START(crc32_pclmul_le_16) /* buffer and buffer size are 16 bytes aligned */
 	movdqa  (BUF), %xmm1
 	movdqa  0x10(BUF), %xmm2
 	movdqa  0x20(BUF), %xmm3
@@ -243,4 +243,4 @@ fold_64:
 	PEXTRD  0x01, %xmm1, %eax
 
 	ret
-ENDPROC(crc32_pclmul_le_16)
+SYM_FUNC_END(crc32_pclmul_le_16)
diff --git a/arch/x86/crypto/crc32c-pcl-intel-asm_64.S b/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
index 7a7de27c6f41..344ec8d9670b 100644
--- a/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
+++ b/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
@@ -73,7 +73,7 @@
 # unsigned int crc_pcl(u8 *buffer, int len, unsigned int crc_init);
 
 .text
-ENTRY(crc_pcl)
+SYM_FUNC_START(crc_pcl)
 #define    bufp		%rdi
 #define    bufp_dw	%edi
 #define    bufp_w	%di
@@ -310,7 +310,7 @@ do_return:
 	popq    %rdi
 	popq    %rbx
         ret
-ENDPROC(crc_pcl)
+SYM_FUNC_END(crc_pcl)
 
 .section	.rodata, "a", @progbits
         ################################################################
diff --git a/arch/x86/crypto/crct10dif-pcl-asm_64.S b/arch/x86/crypto/crct10dif-pcl-asm_64.S
index de04d3e98d8d..f56b499541e0 100644
--- a/arch/x86/crypto/crct10dif-pcl-asm_64.S
+++ b/arch/x86/crypto/crct10dif-pcl-asm_64.S
@@ -68,7 +68,7 @@
 
 #define        arg1_low32 %edi
 
-ENTRY(crc_t10dif_pcl)
+SYM_FUNC_START(crc_t10dif_pcl)
 .align 16
 
 	# adjust the 16-bit initial_crc value, scale it to 32 bits
@@ -552,7 +552,7 @@ _only_less_than_2:
 
 	jmp	_barrett
 
-ENDPROC(crc_t10dif_pcl)
+SYM_FUNC_END(crc_t10dif_pcl)
 
 .section	.rodata, "a", @progbits
 .align 16
diff --git a/arch/x86/crypto/des3_ede-asm_64.S b/arch/x86/crypto/des3_ede-asm_64.S
index f3e91647ca27..0c39ed072173 100644
--- a/arch/x86/crypto/des3_ede-asm_64.S
+++ b/arch/x86/crypto/des3_ede-asm_64.S
@@ -171,7 +171,7 @@
 	movl   left##d,   (io); \
 	movl   right##d, 4(io);
 
-ENTRY(des3_ede_x86_64_crypt_blk)
+SYM_FUNC_START(des3_ede_x86_64_crypt_blk)
 	/* input:
 	 *	%rdi: round keys, CTX
 	 *	%rsi: dst
@@ -251,7 +251,7 @@ ENTRY(des3_ede_x86_64_crypt_blk)
 	popq %rbp;
 
 	ret;
-ENDPROC(des3_ede_x86_64_crypt_blk)
+SYM_FUNC_END(des3_ede_x86_64_crypt_blk)
 
 /***********************************************************************
  * 3-way 3DES
@@ -425,7 +425,7 @@ ENDPROC(des3_ede_x86_64_crypt_blk)
 #define __movq(src, dst) \
 	movq src, dst;
 
-ENTRY(des3_ede_x86_64_crypt_blk_3way)
+SYM_FUNC_START(des3_ede_x86_64_crypt_blk_3way)
 	/* input:
 	 *	%rdi: ctx, round keys
 	 *	%rsi: dst (3 blocks)
@@ -535,7 +535,7 @@ ENTRY(des3_ede_x86_64_crypt_blk_3way)
 	popq %rbp;
 
 	ret;
-ENDPROC(des3_ede_x86_64_crypt_blk_3way)
+SYM_FUNC_END(des3_ede_x86_64_crypt_blk_3way)
 
 .section	.rodata, "a", @progbits
 .align 16
diff --git a/arch/x86/crypto/ghash-clmulni-intel_asm.S b/arch/x86/crypto/ghash-clmulni-intel_asm.S
index c3db86842578..12e3a850257b 100644
--- a/arch/x86/crypto/ghash-clmulni-intel_asm.S
+++ b/arch/x86/crypto/ghash-clmulni-intel_asm.S
@@ -93,7 +93,7 @@ SYM_FUNC_START_LOCAL(__clmul_gf128mul_ble)
 SYM_FUNC_END(__clmul_gf128mul_ble)
 
 /* void clmul_ghash_mul(char *dst, const u128 *shash) */
-ENTRY(clmul_ghash_mul)
+SYM_FUNC_START(clmul_ghash_mul)
 	FRAME_BEGIN
 	movups (%rdi), DATA
 	movups (%rsi), SHASH
@@ -104,13 +104,13 @@ ENTRY(clmul_ghash_mul)
 	movups DATA, (%rdi)
 	FRAME_END
 	ret
-ENDPROC(clmul_ghash_mul)
+SYM_FUNC_END(clmul_ghash_mul)
 
 /*
  * void clmul_ghash_update(char *dst, const char *src, unsigned int srclen,
  *			   const u128 *shash);
  */
-ENTRY(clmul_ghash_update)
+SYM_FUNC_START(clmul_ghash_update)
 	FRAME_BEGIN
 	cmp $16, %rdx
 	jb .Lupdate_just_ret	# check length
@@ -133,4 +133,4 @@ ENTRY(clmul_ghash_update)
 .Lupdate_just_ret:
 	FRAME_END
 	ret
-ENDPROC(clmul_ghash_update)
+SYM_FUNC_END(clmul_ghash_update)
diff --git a/arch/x86/crypto/poly1305-avx2-x86_64.S b/arch/x86/crypto/poly1305-avx2-x86_64.S
index 3b6e70d085da..68b0f4386dc4 100644
--- a/arch/x86/crypto/poly1305-avx2-x86_64.S
+++ b/arch/x86/crypto/poly1305-avx2-x86_64.S
@@ -83,7 +83,7 @@ ORMASK:	.octa 0x00000000010000000000000001000000
 #define d3 %r12
 #define d4 %r13
 
-ENTRY(poly1305_4block_avx2)
+SYM_FUNC_START(poly1305_4block_avx2)
 	# %rdi: Accumulator h[5]
 	# %rsi: 64 byte input block m
 	# %rdx: Poly1305 key r[5]
@@ -385,4 +385,4 @@ ENTRY(poly1305_4block_avx2)
 	pop		%r12
 	pop		%rbx
 	ret
-ENDPROC(poly1305_4block_avx2)
+SYM_FUNC_END(poly1305_4block_avx2)
diff --git a/arch/x86/crypto/poly1305-sse2-x86_64.S b/arch/x86/crypto/poly1305-sse2-x86_64.S
index c88c670cb5fc..66715fbedc18 100644
--- a/arch/x86/crypto/poly1305-sse2-x86_64.S
+++ b/arch/x86/crypto/poly1305-sse2-x86_64.S
@@ -50,7 +50,7 @@ ORMASK:	.octa 0x00000000010000000000000001000000
 #define d3 %r11
 #define d4 %r12
 
-ENTRY(poly1305_block_sse2)
+SYM_FUNC_START(poly1305_block_sse2)
 	# %rdi: Accumulator h[5]
 	# %rsi: 16 byte input block m
 	# %rdx: Poly1305 key r[5]
@@ -276,7 +276,7 @@ ENTRY(poly1305_block_sse2)
 	pop		%r12
 	pop		%rbx
 	ret
-ENDPROC(poly1305_block_sse2)
+SYM_FUNC_END(poly1305_block_sse2)
 
 
 #define u0 0x00(%r8)
@@ -301,7 +301,7 @@ ENDPROC(poly1305_block_sse2)
 #undef d0
 #define d0 %r13
 
-ENTRY(poly1305_2block_sse2)
+SYM_FUNC_START(poly1305_2block_sse2)
 	# %rdi: Accumulator h[5]
 	# %rsi: 16 byte input block m
 	# %rdx: Poly1305 key r[5]
@@ -581,4 +581,4 @@ ENTRY(poly1305_2block_sse2)
 	pop		%r12
 	pop		%rbx
 	ret
-ENDPROC(poly1305_2block_sse2)
+SYM_FUNC_END(poly1305_2block_sse2)
diff --git a/arch/x86/crypto/salsa20-x86_64-asm_64.S b/arch/x86/crypto/salsa20-x86_64-asm_64.S
index 9279e0b2d60e..a5f3dd15a755 100644
--- a/arch/x86/crypto/salsa20-x86_64-asm_64.S
+++ b/arch/x86/crypto/salsa20-x86_64-asm_64.S
@@ -1,7 +1,7 @@
 #include <linux/linkage.h>
 
 # enter salsa20_encrypt_bytes
-ENTRY(salsa20_encrypt_bytes)
+SYM_FUNC_START(salsa20_encrypt_bytes)
 	mov	%rsp,%r11
 	and	$31,%r11
 	add	$256,%r11
@@ -801,10 +801,10 @@ ENTRY(salsa20_encrypt_bytes)
 	# comment:fp stack unchanged by jump
 	# goto bytesatleast1
 	jmp	._bytesatleast1
-ENDPROC(salsa20_encrypt_bytes)
+SYM_FUNC_END(salsa20_encrypt_bytes)
 
 # enter salsa20_keysetup
-ENTRY(salsa20_keysetup)
+SYM_FUNC_START(salsa20_keysetup)
 	mov	%rsp,%r11
 	and	$31,%r11
 	add	$256,%r11
@@ -890,10 +890,10 @@ ENTRY(salsa20_keysetup)
 	mov	%rdi,%rax
 	mov	%rsi,%rdx
 	ret
-ENDPROC(salsa20_keysetup)
+SYM_FUNC_END(salsa20_keysetup)
 
 # enter salsa20_ivsetup
-ENTRY(salsa20_ivsetup)
+SYM_FUNC_START(salsa20_ivsetup)
 	mov	%rsp,%r11
 	and	$31,%r11
 	add	$256,%r11
@@ -915,4 +915,4 @@ ENTRY(salsa20_ivsetup)
 	mov	%rdi,%rax
 	mov	%rsi,%rdx
 	ret
-ENDPROC(salsa20_ivsetup)
+SYM_FUNC_END(salsa20_ivsetup)
diff --git a/arch/x86/crypto/serpent-avx-x86_64-asm_64.S b/arch/x86/crypto/serpent-avx-x86_64-asm_64.S
index c2d4a1fc9ee8..72de86a8091e 100644
--- a/arch/x86/crypto/serpent-avx-x86_64-asm_64.S
+++ b/arch/x86/crypto/serpent-avx-x86_64-asm_64.S
@@ -677,7 +677,7 @@ SYM_FUNC_START_LOCAL(__serpent_dec_blk8_avx)
 	ret;
 SYM_FUNC_END(__serpent_dec_blk8_avx)
 
-ENTRY(serpent_ecb_enc_8way_avx)
+SYM_FUNC_START(serpent_ecb_enc_8way_avx)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst
@@ -693,9 +693,9 @@ ENTRY(serpent_ecb_enc_8way_avx)
 
 	FRAME_END
 	ret;
-ENDPROC(serpent_ecb_enc_8way_avx)
+SYM_FUNC_END(serpent_ecb_enc_8way_avx)
 
-ENTRY(serpent_ecb_dec_8way_avx)
+SYM_FUNC_START(serpent_ecb_dec_8way_avx)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst
@@ -711,9 +711,9 @@ ENTRY(serpent_ecb_dec_8way_avx)
 
 	FRAME_END
 	ret;
-ENDPROC(serpent_ecb_dec_8way_avx)
+SYM_FUNC_END(serpent_ecb_dec_8way_avx)
 
-ENTRY(serpent_cbc_dec_8way_avx)
+SYM_FUNC_START(serpent_cbc_dec_8way_avx)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst
@@ -729,9 +729,9 @@ ENTRY(serpent_cbc_dec_8way_avx)
 
 	FRAME_END
 	ret;
-ENDPROC(serpent_cbc_dec_8way_avx)
+SYM_FUNC_END(serpent_cbc_dec_8way_avx)
 
-ENTRY(serpent_ctr_8way_avx)
+SYM_FUNC_START(serpent_ctr_8way_avx)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst
@@ -749,9 +749,9 @@ ENTRY(serpent_ctr_8way_avx)
 
 	FRAME_END
 	ret;
-ENDPROC(serpent_ctr_8way_avx)
+SYM_FUNC_END(serpent_ctr_8way_avx)
 
-ENTRY(serpent_xts_enc_8way_avx)
+SYM_FUNC_START(serpent_xts_enc_8way_avx)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst
@@ -771,9 +771,9 @@ ENTRY(serpent_xts_enc_8way_avx)
 
 	FRAME_END
 	ret;
-ENDPROC(serpent_xts_enc_8way_avx)
+SYM_FUNC_END(serpent_xts_enc_8way_avx)
 
-ENTRY(serpent_xts_dec_8way_avx)
+SYM_FUNC_START(serpent_xts_dec_8way_avx)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst
@@ -793,4 +793,4 @@ ENTRY(serpent_xts_dec_8way_avx)
 
 	FRAME_END
 	ret;
-ENDPROC(serpent_xts_dec_8way_avx)
+SYM_FUNC_END(serpent_xts_dec_8way_avx)
diff --git a/arch/x86/crypto/serpent-avx2-asm_64.S b/arch/x86/crypto/serpent-avx2-asm_64.S
index 52c527ce4b18..b866f1632803 100644
--- a/arch/x86/crypto/serpent-avx2-asm_64.S
+++ b/arch/x86/crypto/serpent-avx2-asm_64.S
@@ -673,7 +673,7 @@ SYM_FUNC_START_LOCAL(__serpent_dec_blk16)
 	ret;
 SYM_FUNC_END(__serpent_dec_blk16)
 
-ENTRY(serpent_ecb_enc_16way)
+SYM_FUNC_START(serpent_ecb_enc_16way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst
@@ -693,9 +693,9 @@ ENTRY(serpent_ecb_enc_16way)
 
 	FRAME_END
 	ret;
-ENDPROC(serpent_ecb_enc_16way)
+SYM_FUNC_END(serpent_ecb_enc_16way)
 
-ENTRY(serpent_ecb_dec_16way)
+SYM_FUNC_START(serpent_ecb_dec_16way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst
@@ -715,9 +715,9 @@ ENTRY(serpent_ecb_dec_16way)
 
 	FRAME_END
 	ret;
-ENDPROC(serpent_ecb_dec_16way)
+SYM_FUNC_END(serpent_ecb_dec_16way)
 
-ENTRY(serpent_cbc_dec_16way)
+SYM_FUNC_START(serpent_cbc_dec_16way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst
@@ -738,9 +738,9 @@ ENTRY(serpent_cbc_dec_16way)
 
 	FRAME_END
 	ret;
-ENDPROC(serpent_cbc_dec_16way)
+SYM_FUNC_END(serpent_cbc_dec_16way)
 
-ENTRY(serpent_ctr_16way)
+SYM_FUNC_START(serpent_ctr_16way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst (16 blocks)
@@ -763,9 +763,9 @@ ENTRY(serpent_ctr_16way)
 
 	FRAME_END
 	ret;
-ENDPROC(serpent_ctr_16way)
+SYM_FUNC_END(serpent_ctr_16way)
 
-ENTRY(serpent_xts_enc_16way)
+SYM_FUNC_START(serpent_xts_enc_16way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst (16 blocks)
@@ -789,9 +789,9 @@ ENTRY(serpent_xts_enc_16way)
 
 	FRAME_END
 	ret;
-ENDPROC(serpent_xts_enc_16way)
+SYM_FUNC_END(serpent_xts_enc_16way)
 
-ENTRY(serpent_xts_dec_16way)
+SYM_FUNC_START(serpent_xts_dec_16way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst (16 blocks)
@@ -815,4 +815,4 @@ ENTRY(serpent_xts_dec_16way)
 
 	FRAME_END
 	ret;
-ENDPROC(serpent_xts_dec_16way)
+SYM_FUNC_END(serpent_xts_dec_16way)
diff --git a/arch/x86/crypto/serpent-sse2-x86_64-asm_64.S b/arch/x86/crypto/serpent-sse2-x86_64-asm_64.S
index acc066c7c6b2..bdeee900df63 100644
--- a/arch/x86/crypto/serpent-sse2-x86_64-asm_64.S
+++ b/arch/x86/crypto/serpent-sse2-x86_64-asm_64.S
@@ -634,7 +634,7 @@
 	pxor t0,		x3; \
 	movdqu x3,		(3*4*4)(out);
 
-ENTRY(__serpent_enc_blk_8way)
+SYM_FUNC_START(__serpent_enc_blk_8way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst
@@ -697,9 +697,9 @@ ENTRY(__serpent_enc_blk_8way)
 	xor_blocks(%rax, RA2, RB2, RC2, RD2, RK0, RK1, RK2);
 
 	ret;
-ENDPROC(__serpent_enc_blk_8way)
+SYM_FUNC_END(__serpent_enc_blk_8way)
 
-ENTRY(serpent_dec_blk_8way)
+SYM_FUNC_START(serpent_dec_blk_8way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst
@@ -751,4 +751,4 @@ ENTRY(serpent_dec_blk_8way)
 	write_blocks(%rax, RC2, RD2, RB2, RE2, RK0, RK1, RK2);
 
 	ret;
-ENDPROC(serpent_dec_blk_8way)
+SYM_FUNC_END(serpent_dec_blk_8way)
diff --git a/arch/x86/crypto/sha1-mb/sha1_mb_mgr_flush_avx2.S b/arch/x86/crypto/sha1-mb/sha1_mb_mgr_flush_avx2.S
index 93b945597ecf..7623a16c3c5f 100644
--- a/arch/x86/crypto/sha1-mb/sha1_mb_mgr_flush_avx2.S
+++ b/arch/x86/crypto/sha1-mb/sha1_mb_mgr_flush_avx2.S
@@ -103,7 +103,7 @@ offset = \_offset
 
 # JOB* sha1_mb_mgr_flush_avx2(MB_MGR *state)
 # arg 1 : rcx : state
-ENTRY(sha1_mb_mgr_flush_avx2)
+SYM_FUNC_START(sha1_mb_mgr_flush_avx2)
 	FRAME_BEGIN
 	push	%rbx
 
@@ -220,13 +220,13 @@ return:
 return_null:
 	xor     job_rax, job_rax
 	jmp     return
-ENDPROC(sha1_mb_mgr_flush_avx2)
+SYM_FUNC_END(sha1_mb_mgr_flush_avx2)
 
 
 #################################################################
 
 .align 16
-ENTRY(sha1_mb_mgr_get_comp_job_avx2)
+SYM_FUNC_START(sha1_mb_mgr_get_comp_job_avx2)
 	push    %rbx
 
 	## if bit 32+3 is set, then all lanes are empty
@@ -279,7 +279,7 @@ ENTRY(sha1_mb_mgr_get_comp_job_avx2)
 	xor     job_rax, job_rax
 	pop     %rbx
 	ret
-ENDPROC(sha1_mb_mgr_get_comp_job_avx2)
+SYM_FUNC_END(sha1_mb_mgr_get_comp_job_avx2)
 
 .section	.rodata.cst16.clear_low_nibble, "aM", @progbits, 16
 .align 16
diff --git a/arch/x86/crypto/sha1-mb/sha1_mb_mgr_submit_avx2.S b/arch/x86/crypto/sha1-mb/sha1_mb_mgr_submit_avx2.S
index 7a93b1c0d69a..a46e3b04385e 100644
--- a/arch/x86/crypto/sha1-mb/sha1_mb_mgr_submit_avx2.S
+++ b/arch/x86/crypto/sha1-mb/sha1_mb_mgr_submit_avx2.S
@@ -98,7 +98,7 @@ lane_data       = %r10
 # JOB* submit_mb_mgr_submit_avx2(MB_MGR *state, job_sha1 *job)
 # arg 1 : rcx : state
 # arg 2 : rdx : job
-ENTRY(sha1_mb_mgr_submit_avx2)
+SYM_FUNC_START(sha1_mb_mgr_submit_avx2)
 	FRAME_BEGIN
 	push	%rbx
 	push	%r12
@@ -201,7 +201,7 @@ return_null:
 	xor     job_rax, job_rax
 	jmp     return
 
-ENDPROC(sha1_mb_mgr_submit_avx2)
+SYM_FUNC_END(sha1_mb_mgr_submit_avx2)
 
 .section	.rodata.cst16.clear_low_nibble, "aM", @progbits, 16
 .align 16
diff --git a/arch/x86/crypto/sha1-mb/sha1_x8_avx2.S b/arch/x86/crypto/sha1-mb/sha1_x8_avx2.S
index 20f77aa633de..04d763520a82 100644
--- a/arch/x86/crypto/sha1-mb/sha1_x8_avx2.S
+++ b/arch/x86/crypto/sha1-mb/sha1_x8_avx2.S
@@ -294,7 +294,7 @@ W14  = TMP_
 # arg 1 : pointer to array[4] of pointer to input data
 # arg 2 : size (in blocks) ;; assumed to be >= 1
 #
-ENTRY(sha1_x8_avx2)
+SYM_FUNC_START(sha1_x8_avx2)
 
 	# save callee-saved clobbered registers to comply with C function ABI
 	push	%r12
@@ -458,7 +458,7 @@ lloop:
 	pop	%r12
 
 	ret
-ENDPROC(sha1_x8_avx2)
+SYM_FUNC_END(sha1_x8_avx2)
 
 
 .section	.rodata.cst32.K00_19, "aM", @progbits, 32
diff --git a/arch/x86/crypto/sha1_avx2_x86_64_asm.S b/arch/x86/crypto/sha1_avx2_x86_64_asm.S
index 1cd792db15ef..4a477cb80fa5 100644
--- a/arch/x86/crypto/sha1_avx2_x86_64_asm.S
+++ b/arch/x86/crypto/sha1_avx2_x86_64_asm.S
@@ -622,7 +622,7 @@ _loop3:
  * param: function's name
  */
 .macro SHA1_VECTOR_ASM  name
-	ENTRY(\name)
+	SYM_FUNC_START(\name)
 
 	push	%rbx
 	push	%rbp
@@ -673,7 +673,7 @@ _loop3:
 
 	ret
 
-	ENDPROC(\name)
+	SYM_FUNC_END(\name)
 .endm
 
 .section .rodata
diff --git a/arch/x86/crypto/sha1_ni_asm.S b/arch/x86/crypto/sha1_ni_asm.S
index ebbdba72ae07..11efe3a45a1f 100644
--- a/arch/x86/crypto/sha1_ni_asm.S
+++ b/arch/x86/crypto/sha1_ni_asm.S
@@ -95,7 +95,7 @@
  */
 .text
 .align 32
-ENTRY(sha1_ni_transform)
+SYM_FUNC_START(sha1_ni_transform)
 	mov		%rsp, RSPSAVE
 	sub		$FRAME_SIZE, %rsp
 	and		$~0xF, %rsp
@@ -291,7 +291,7 @@ ENTRY(sha1_ni_transform)
 	mov		RSPSAVE, %rsp
 
 	ret
-ENDPROC(sha1_ni_transform)
+SYM_FUNC_END(sha1_ni_transform)
 
 .section	.rodata.cst16.PSHUFFLE_BYTE_FLIP_MASK, "aM", @progbits, 16
 .align 16
diff --git a/arch/x86/crypto/sha1_ssse3_asm.S b/arch/x86/crypto/sha1_ssse3_asm.S
index a4109506a5e8..5a5b433272e5 100644
--- a/arch/x86/crypto/sha1_ssse3_asm.S
+++ b/arch/x86/crypto/sha1_ssse3_asm.S
@@ -71,7 +71,7 @@
  * param: function's name
  */
 .macro SHA1_VECTOR_ASM  name
-	ENTRY(\name)
+	SYM_FUNC_START(\name)
 
 	push	%rbx
 	push	%rbp
@@ -106,7 +106,7 @@
 	pop	%rbx
 	ret
 
-	ENDPROC(\name)
+	SYM_FUNC_END(\name)
 .endm
 
 /*
diff --git a/arch/x86/crypto/sha256-avx-asm.S b/arch/x86/crypto/sha256-avx-asm.S
index e08888a1a5f2..b3208cd5aee9 100644
--- a/arch/x86/crypto/sha256-avx-asm.S
+++ b/arch/x86/crypto/sha256-avx-asm.S
@@ -347,7 +347,7 @@ a = TMP_
 ## arg 3 : Num blocks
 ########################################################################
 .text
-ENTRY(sha256_transform_avx)
+SYM_FUNC_START(sha256_transform_avx)
 .align 32
 	pushq   %rbx
 	pushq   %rbp
@@ -461,7 +461,7 @@ done_hash:
 	popq    %rbp
 	popq    %rbx
 	ret
-ENDPROC(sha256_transform_avx)
+SYM_FUNC_END(sha256_transform_avx)
 
 .section	.rodata.cst256.K256, "aM", @progbits, 256
 .align 64
diff --git a/arch/x86/crypto/sha256-avx2-asm.S b/arch/x86/crypto/sha256-avx2-asm.S
index 89c8f09787d2..63d204bc1148 100644
--- a/arch/x86/crypto/sha256-avx2-asm.S
+++ b/arch/x86/crypto/sha256-avx2-asm.S
@@ -528,7 +528,7 @@ STACK_SIZE	= _RSP      + _RSP_SIZE
 ## arg 3 : Num blocks
 ########################################################################
 .text
-ENTRY(sha256_transform_rorx)
+SYM_FUNC_START(sha256_transform_rorx)
 .align 32
 	pushq	%rbx
 	pushq	%rbp
@@ -721,7 +721,7 @@ done_hash:
 	popq	%rbp
 	popq	%rbx
 	ret
-ENDPROC(sha256_transform_rorx)
+SYM_FUNC_END(sha256_transform_rorx)
 
 .section	.rodata.cst512.K256, "aM", @progbits, 512
 .align 64
diff --git a/arch/x86/crypto/sha256-mb/sha256_mb_mgr_flush_avx2.S b/arch/x86/crypto/sha256-mb/sha256_mb_mgr_flush_avx2.S
index 8fe6338bcc84..9681227be16c 100644
--- a/arch/x86/crypto/sha256-mb/sha256_mb_mgr_flush_avx2.S
+++ b/arch/x86/crypto/sha256-mb/sha256_mb_mgr_flush_avx2.S
@@ -101,7 +101,7 @@ offset = \_offset
 
 # JOB_SHA256* sha256_mb_mgr_flush_avx2(MB_MGR *state)
 # arg 1 : rcx : state
-ENTRY(sha256_mb_mgr_flush_avx2)
+SYM_FUNC_START(sha256_mb_mgr_flush_avx2)
 	FRAME_BEGIN
         push    %rbx
 
@@ -220,12 +220,12 @@ return:
 return_null:
 	xor	job_rax, job_rax
 	jmp	return
-ENDPROC(sha256_mb_mgr_flush_avx2)
+SYM_FUNC_END(sha256_mb_mgr_flush_avx2)
 
 ##############################################################################
 
 .align 16
-ENTRY(sha256_mb_mgr_get_comp_job_avx2)
+SYM_FUNC_START(sha256_mb_mgr_get_comp_job_avx2)
 	push	%rbx
 
 	## if bit 32+3 is set, then all lanes are empty
@@ -282,7 +282,7 @@ ENTRY(sha256_mb_mgr_get_comp_job_avx2)
 	xor	job_rax, job_rax
 	pop	%rbx
 	ret
-ENDPROC(sha256_mb_mgr_get_comp_job_avx2)
+SYM_FUNC_END(sha256_mb_mgr_get_comp_job_avx2)
 
 .section	.rodata.cst16.clear_low_nibble, "aM", @progbits, 16
 .align 16
diff --git a/arch/x86/crypto/sha256-mb/sha256_mb_mgr_submit_avx2.S b/arch/x86/crypto/sha256-mb/sha256_mb_mgr_submit_avx2.S
index b36ae7454084..2213c04a30dc 100644
--- a/arch/x86/crypto/sha256-mb/sha256_mb_mgr_submit_avx2.S
+++ b/arch/x86/crypto/sha256-mb/sha256_mb_mgr_submit_avx2.S
@@ -96,7 +96,7 @@ lane_data	= %r10
 # JOB* sha256_mb_mgr_submit_avx2(MB_MGR *state, JOB_SHA256 *job)
 # arg 1 : rcx : state
 # arg 2 : rdx : job
-ENTRY(sha256_mb_mgr_submit_avx2)
+SYM_FUNC_START(sha256_mb_mgr_submit_avx2)
 	FRAME_BEGIN
 	push	%rbx
 	push	%r12
@@ -206,7 +206,7 @@ return_null:
 	xor	job_rax, job_rax
 	jmp	return
 
-ENDPROC(sha256_mb_mgr_submit_avx2)
+SYM_FUNC_END(sha256_mb_mgr_submit_avx2)
 
 .section	.rodata.cst16.clear_low_nibble, "aM", @progbits, 16
 .align 16
diff --git a/arch/x86/crypto/sha256-mb/sha256_x8_avx2.S b/arch/x86/crypto/sha256-mb/sha256_x8_avx2.S
index 1687c80c5995..042d2381f435 100644
--- a/arch/x86/crypto/sha256-mb/sha256_x8_avx2.S
+++ b/arch/x86/crypto/sha256-mb/sha256_x8_avx2.S
@@ -280,7 +280,7 @@ a = TMP_
 	# general registers preserved in outer calling routine
 	# outer calling routine saves all the XMM registers
 	# save rsp, allocate 32-byte aligned for local variables
-ENTRY(sha256_x8_avx2)
+SYM_FUNC_START(sha256_x8_avx2)
 
 	# save callee-saved clobbered registers to comply with C function ABI
 	push    %r12
@@ -436,7 +436,7 @@ Lrounds_16_xx:
 	pop     %r12
 
 	ret
-ENDPROC(sha256_x8_avx2)
+SYM_FUNC_END(sha256_x8_avx2)
 
 .section	.rodata.K256_8, "a", @progbits
 .align 64
diff --git a/arch/x86/crypto/sha256-ssse3-asm.S b/arch/x86/crypto/sha256-ssse3-asm.S
index 39b83c93e7fd..281633643bb7 100644
--- a/arch/x86/crypto/sha256-ssse3-asm.S
+++ b/arch/x86/crypto/sha256-ssse3-asm.S
@@ -353,7 +353,7 @@ a = TMP_
 ## arg 3 : Num blocks
 ########################################################################
 .text
-ENTRY(sha256_transform_ssse3)
+SYM_FUNC_START(sha256_transform_ssse3)
 .align 32
 	pushq   %rbx
 	pushq   %rbp
@@ -472,7 +472,7 @@ done_hash:
 	popq    %rbx
 
 	ret
-ENDPROC(sha256_transform_ssse3)
+SYM_FUNC_END(sha256_transform_ssse3)
 
 .section	.rodata.cst256.K256, "aM", @progbits, 256
 .align 64
diff --git a/arch/x86/crypto/sha256_ni_asm.S b/arch/x86/crypto/sha256_ni_asm.S
index fb58f58ecfbc..7abade04a3a3 100644
--- a/arch/x86/crypto/sha256_ni_asm.S
+++ b/arch/x86/crypto/sha256_ni_asm.S
@@ -97,7 +97,7 @@
 
 .text
 .align 32
-ENTRY(sha256_ni_transform)
+SYM_FUNC_START(sha256_ni_transform)
 
 	shl		$6, NUM_BLKS		/*  convert to bytes */
 	jz		.Ldone_hash
@@ -327,7 +327,7 @@ ENTRY(sha256_ni_transform)
 .Ldone_hash:
 
 	ret
-ENDPROC(sha256_ni_transform)
+SYM_FUNC_END(sha256_ni_transform)
 
 .section	.rodata.cst256.K256, "aM", @progbits, 256
 .align 64
diff --git a/arch/x86/crypto/sha512-avx-asm.S b/arch/x86/crypto/sha512-avx-asm.S
index 39235fefe6f7..3704ddd7e5d5 100644
--- a/arch/x86/crypto/sha512-avx-asm.S
+++ b/arch/x86/crypto/sha512-avx-asm.S
@@ -277,7 +277,7 @@ frame_size = frame_GPRSAVE + GPRSAVE_SIZE
 # message blocks.
 # L is the message length in SHA512 blocks
 ########################################################################
-ENTRY(sha512_transform_avx)
+SYM_FUNC_START(sha512_transform_avx)
 	cmp $0, msglen
 	je nowork
 
@@ -365,7 +365,7 @@ updateblock:
 
 nowork:
 	ret
-ENDPROC(sha512_transform_avx)
+SYM_FUNC_END(sha512_transform_avx)
 
 ########################################################################
 ### Binary Data
diff --git a/arch/x86/crypto/sha512-avx2-asm.S b/arch/x86/crypto/sha512-avx2-asm.S
index 7f5f6c6ec72e..86cca90edf64 100644
--- a/arch/x86/crypto/sha512-avx2-asm.S
+++ b/arch/x86/crypto/sha512-avx2-asm.S
@@ -568,7 +568,7 @@ frame_size = frame_GPRSAVE + GPRSAVE_SIZE
 #   message blocks.
 # L is the message length in SHA512 blocks
 ########################################################################
-ENTRY(sha512_transform_rorx)
+SYM_FUNC_START(sha512_transform_rorx)
 	# Allocate Stack Space
 	mov	%rsp, %rax
 	sub	$frame_size, %rsp
@@ -679,7 +679,7 @@ done_hash:
 	# Restore Stack Pointer
 	mov	frame_RSPSAVE(%rsp), %rsp
 	ret
-ENDPROC(sha512_transform_rorx)
+SYM_FUNC_END(sha512_transform_rorx)
 
 ########################################################################
 ### Binary Data
diff --git a/arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S b/arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S
index 7c629caebc05..8642f3a04388 100644
--- a/arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S
+++ b/arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S
@@ -107,7 +107,7 @@ offset = \_offset
 
 # JOB* sha512_mb_mgr_flush_avx2(MB_MGR *state)
 # arg 1 : rcx : state
-ENTRY(sha512_mb_mgr_flush_avx2)
+SYM_FUNC_START(sha512_mb_mgr_flush_avx2)
 	FRAME_BEGIN
 	push	%rbx
 
@@ -217,10 +217,10 @@ return:
 return_null:
         xor     job_rax, job_rax
         jmp     return
-ENDPROC(sha512_mb_mgr_flush_avx2)
+SYM_FUNC_END(sha512_mb_mgr_flush_avx2)
 .align 16
 
-ENTRY(sha512_mb_mgr_get_comp_job_avx2)
+SYM_FUNC_START(sha512_mb_mgr_get_comp_job_avx2)
         push    %rbx
 
 	mov     _unused_lanes(state), unused_lanes
@@ -279,7 +279,7 @@ ENTRY(sha512_mb_mgr_get_comp_job_avx2)
         xor     job_rax, job_rax
 	pop     %rbx
         ret
-ENDPROC(sha512_mb_mgr_get_comp_job_avx2)
+SYM_FUNC_END(sha512_mb_mgr_get_comp_job_avx2)
 
 .section	.rodata.cst8.one, "aM", @progbits, 8
 .align 8
diff --git a/arch/x86/crypto/sha512-mb/sha512_mb_mgr_submit_avx2.S b/arch/x86/crypto/sha512-mb/sha512_mb_mgr_submit_avx2.S
index 4ba709ba78e5..62932723d6e9 100644
--- a/arch/x86/crypto/sha512-mb/sha512_mb_mgr_submit_avx2.S
+++ b/arch/x86/crypto/sha512-mb/sha512_mb_mgr_submit_avx2.S
@@ -98,7 +98,7 @@
 # JOB* sha512_mb_mgr_submit_avx2(MB_MGR *state, JOB *job)
 # arg 1 : rcx : state
 # arg 2 : rdx : job
-ENTRY(sha512_mb_mgr_submit_avx2)
+SYM_FUNC_START(sha512_mb_mgr_submit_avx2)
 	FRAME_BEGIN
 	push	%rbx
 	push	%r12
@@ -208,7 +208,7 @@ return:
 return_null:
 	xor     job_rax, job_rax
 	jmp     return
-ENDPROC(sha512_mb_mgr_submit_avx2)
+SYM_FUNC_END(sha512_mb_mgr_submit_avx2)
 
 /* UNUSED?
 .section	.rodata.cst16, "aM", @progbits, 16
diff --git a/arch/x86/crypto/sha512-mb/sha512_x4_avx2.S b/arch/x86/crypto/sha512-mb/sha512_x4_avx2.S
index e22e907643a6..504065d19e03 100644
--- a/arch/x86/crypto/sha512-mb/sha512_x4_avx2.S
+++ b/arch/x86/crypto/sha512-mb/sha512_x4_avx2.S
@@ -239,7 +239,7 @@ a = TMP_
 # void sha512_x4_avx2(void *STATE, const int INP_SIZE)
 # arg 1 : STATE    : pointer to input data
 # arg 2 : INP_SIZE : size of data in blocks (assumed >= 1)
-ENTRY(sha512_x4_avx2)
+SYM_FUNC_START(sha512_x4_avx2)
 	# general registers preserved in outer calling routine
 	# outer calling routine saves all the XMM registers
 	# save callee-saved clobbered registers to comply with C function ABI
@@ -359,7 +359,7 @@ Lrounds_16_xx:
 
 	# outer calling routine restores XMM and other GP registers
 	ret
-ENDPROC(sha512_x4_avx2)
+SYM_FUNC_END(sha512_x4_avx2)
 
 .section	.rodata.K512_4, "a", @progbits
 .align 64
diff --git a/arch/x86/crypto/sha512-ssse3-asm.S b/arch/x86/crypto/sha512-ssse3-asm.S
index 66bbd9058a90..838f984e95d9 100644
--- a/arch/x86/crypto/sha512-ssse3-asm.S
+++ b/arch/x86/crypto/sha512-ssse3-asm.S
@@ -275,7 +275,7 @@ frame_size = frame_GPRSAVE + GPRSAVE_SIZE
 #   message blocks.
 # L is the message length in SHA512 blocks.
 ########################################################################
-ENTRY(sha512_transform_ssse3)
+SYM_FUNC_START(sha512_transform_ssse3)
 
 	cmp $0, msglen
 	je nowork
@@ -364,7 +364,7 @@ updateblock:
 
 nowork:
 	ret
-ENDPROC(sha512_transform_ssse3)
+SYM_FUNC_END(sha512_transform_ssse3)
 
 ########################################################################
 ### Binary Data
diff --git a/arch/x86/crypto/twofish-avx-x86_64-asm_64.S b/arch/x86/crypto/twofish-avx-x86_64-asm_64.S
index 1e87dcde342f..07af53adbc56 100644
--- a/arch/x86/crypto/twofish-avx-x86_64-asm_64.S
+++ b/arch/x86/crypto/twofish-avx-x86_64-asm_64.S
@@ -330,7 +330,7 @@ SYM_FUNC_START_LOCAL(__twofish_dec_blk8)
 	ret;
 SYM_FUNC_END(__twofish_dec_blk8)
 
-ENTRY(twofish_ecb_enc_8way)
+SYM_FUNC_START(twofish_ecb_enc_8way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst
@@ -348,9 +348,9 @@ ENTRY(twofish_ecb_enc_8way)
 
 	FRAME_END
 	ret;
-ENDPROC(twofish_ecb_enc_8way)
+SYM_FUNC_END(twofish_ecb_enc_8way)
 
-ENTRY(twofish_ecb_dec_8way)
+SYM_FUNC_START(twofish_ecb_dec_8way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst
@@ -368,9 +368,9 @@ ENTRY(twofish_ecb_dec_8way)
 
 	FRAME_END
 	ret;
-ENDPROC(twofish_ecb_dec_8way)
+SYM_FUNC_END(twofish_ecb_dec_8way)
 
-ENTRY(twofish_cbc_dec_8way)
+SYM_FUNC_START(twofish_cbc_dec_8way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst
@@ -393,9 +393,9 @@ ENTRY(twofish_cbc_dec_8way)
 
 	FRAME_END
 	ret;
-ENDPROC(twofish_cbc_dec_8way)
+SYM_FUNC_END(twofish_cbc_dec_8way)
 
-ENTRY(twofish_ctr_8way)
+SYM_FUNC_START(twofish_ctr_8way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst
@@ -420,9 +420,9 @@ ENTRY(twofish_ctr_8way)
 
 	FRAME_END
 	ret;
-ENDPROC(twofish_ctr_8way)
+SYM_FUNC_END(twofish_ctr_8way)
 
-ENTRY(twofish_xts_enc_8way)
+SYM_FUNC_START(twofish_xts_enc_8way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst
@@ -444,9 +444,9 @@ ENTRY(twofish_xts_enc_8way)
 
 	FRAME_END
 	ret;
-ENDPROC(twofish_xts_enc_8way)
+SYM_FUNC_END(twofish_xts_enc_8way)
 
-ENTRY(twofish_xts_dec_8way)
+SYM_FUNC_START(twofish_xts_dec_8way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst
@@ -468,4 +468,4 @@ ENTRY(twofish_xts_dec_8way)
 
 	FRAME_END
 	ret;
-ENDPROC(twofish_xts_dec_8way)
+SYM_FUNC_END(twofish_xts_dec_8way)
diff --git a/arch/x86/crypto/twofish-x86_64-asm_64-3way.S b/arch/x86/crypto/twofish-x86_64-asm_64-3way.S
index 1c3b7ceb36d2..ece306e35298 100644
--- a/arch/x86/crypto/twofish-x86_64-asm_64-3way.S
+++ b/arch/x86/crypto/twofish-x86_64-asm_64-3way.S
@@ -216,7 +216,7 @@
 	rorq $32,			RAB2; \
 	outunpack3(mov, RIO, 2, RAB, 2);
 
-ENTRY(__twofish_enc_blk_3way)
+SYM_FUNC_START(__twofish_enc_blk_3way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst
@@ -270,9 +270,9 @@ ENTRY(__twofish_enc_blk_3way)
 	popq %r14;
 	popq %r15;
 	ret;
-ENDPROC(__twofish_enc_blk_3way)
+SYM_FUNC_END(__twofish_enc_blk_3way)
 
-ENTRY(twofish_dec_blk_3way)
+SYM_FUNC_START(twofish_dec_blk_3way)
 	/* input:
 	 *	%rdi: ctx, CTX
 	 *	%rsi: dst
@@ -309,4 +309,4 @@ ENTRY(twofish_dec_blk_3way)
 	popq %r14;
 	popq %r15;
 	ret;
-ENDPROC(twofish_dec_blk_3way)
+SYM_FUNC_END(twofish_dec_blk_3way)
diff --git a/arch/x86/crypto/twofish-x86_64-asm_64.S b/arch/x86/crypto/twofish-x86_64-asm_64.S
index a350c990dc86..74ef6c55d75f 100644
--- a/arch/x86/crypto/twofish-x86_64-asm_64.S
+++ b/arch/x86/crypto/twofish-x86_64-asm_64.S
@@ -215,7 +215,7 @@
 	xor	%r8d,		d ## D;\
 	ror	$1,		d ## D;
 
-ENTRY(twofish_enc_blk)
+SYM_FUNC_START(twofish_enc_blk)
 	pushq    R1
 
 	/* %rdi contains the ctx address */
@@ -266,9 +266,9 @@ ENTRY(twofish_enc_blk)
 	popq	R1
 	movl	$1,%eax
 	ret
-ENDPROC(twofish_enc_blk)
+SYM_FUNC_END(twofish_enc_blk)
 
-ENTRY(twofish_dec_blk)
+SYM_FUNC_START(twofish_dec_blk)
 	pushq    R1
 
 	/* %rdi contains the ctx address */
@@ -318,4 +318,4 @@ ENTRY(twofish_dec_blk)
 	popq	R1
 	movl	$1,%eax
 	ret
-ENDPROC(twofish_dec_blk)
+SYM_FUNC_END(twofish_dec_blk)
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index ab71baad00fb..4de90f9daa1b 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -14,7 +14,7 @@
  *			at the top of the kernel process stack.
  *
  * Some macro usage:
- * - ENTRY/END:		Define functions in the symbol table.
+ * - SYM_FUNC_START/END:Define functions in the symbol table.
  * - TRACE_IRQ_*:	Trace hardirq state for lock debugging.
  * - idtentry:		Define exception entry points.
  */
@@ -43,10 +43,10 @@
 .section .entry.text, "ax"
 
 #ifdef CONFIG_PARAVIRT
-ENTRY(native_usergs_sysret64)
+SYM_FUNC_START(native_usergs_sysret64)
 	swapgs
 	sysretq
-ENDPROC(native_usergs_sysret64)
+SYM_FUNC_END(native_usergs_sysret64)
 #endif /* CONFIG_PARAVIRT */
 
 .macro TRACE_IRQS_IRETQ
@@ -134,7 +134,7 @@ ENDPROC(native_usergs_sysret64)
  * with them due to bugs in both AMD and Intel CPUs.
  */
 
-ENTRY(entry_SYSCALL_64)
+SYM_FUNC_START(entry_SYSCALL_64)
 	/*
 	 * Interrupts are off on entry.
 	 * We do not frame this tiny irq-off block with TRACE_IRQS_OFF/ON,
@@ -321,7 +321,7 @@ syscall_return_via_sysret:
 opportunistic_sysret_failed:
 	SWAPGS
 	jmp	restore_c_regs_and_iret
-ENDPROC(entry_SYSCALL_64)
+SYM_FUNC_END(entry_SYSCALL_64)
 
 SYM_FUNC_START_LOCAL(stub_ptregs_64)
 	/*
@@ -350,10 +350,10 @@ SYM_FUNC_START_LOCAL(stub_ptregs_64)
 SYM_FUNC_END(stub_ptregs_64)
 
 .macro ptregs_stub func
-ENTRY(ptregs_\func)
+SYM_FUNC_START(ptregs_\func)
 	leaq	\func(%rip), %rax
 	jmp	stub_ptregs_64
-ENDPROC(ptregs_\func)
+SYM_FUNC_END(ptregs_\func)
 .endm
 
 /* Instantiate ptregs_stub for each ptregs-using syscall */
@@ -366,7 +366,7 @@ ENDPROC(ptregs_\func)
  * %rdi: prev task
  * %rsi: next task
  */
-ENTRY(__switch_to_asm)
+SYM_FUNC_START(__switch_to_asm)
 	/*
 	 * Save callee-saved registers
 	 * This must match the order in inactive_task_frame
@@ -396,7 +396,7 @@ ENTRY(__switch_to_asm)
 	popq	%rbp
 
 	jmp	__switch_to
-ENDPROC(__switch_to_asm)
+SYM_FUNC_END(__switch_to_asm)
 
 /*
  * A newly forked process directly context switches into this address.
@@ -405,7 +405,7 @@ ENDPROC(__switch_to_asm)
  * rbx: kernel thread func (NULL for user thread)
  * r12: kernel thread arg
  */
-ENTRY(ret_from_fork)
+SYM_FUNC_START(ret_from_fork)
 	FRAME_BEGIN			/* help unwinder find end of stack */
 	movq	%rax, %rdi
 	call	schedule_tail		/* rdi: 'prev' task parameter */
@@ -432,14 +432,14 @@ ENTRY(ret_from_fork)
 	 */
 	movq	$0, RAX(%rsp)
 	jmp	2b
-ENDPROC(ret_from_fork)
+SYM_FUNC_END(ret_from_fork)
 
 /*
  * Build the entry stubs with some assembler magic.
  * We pack 1 stub into every 8-byte block.
  */
 	.align 8
-ENTRY(irq_entries_start)
+SYM_FUNC_START(irq_entries_start)
     vector=FIRST_EXTERNAL_VECTOR
     .rept (FIRST_SYSTEM_VECTOR - FIRST_EXTERNAL_VECTOR)
 	pushq	$(~vector+0x80)			/* Note: always in signed byte range */
@@ -447,7 +447,7 @@ ENTRY(irq_entries_start)
 	jmp	common_interrupt
 	.align	8
     .endr
-ENDPROC(irq_entries_start)
+SYM_FUNC_END(irq_entries_start)
 
 /*
  * Interrupt entry/exit.
@@ -654,13 +654,13 @@ SYM_FUNC_END(common_interrupt)
  * APIC interrupts.
  */
 .macro apicinterrupt3 num sym do_sym
-ENTRY(\sym)
+SYM_FUNC_START(\sym)
 	ASM_CLAC
 	pushq	$~(\num)
 .Lcommon_\sym:
 	interrupt \do_sym
 	jmp	ret_from_intr
-ENDPROC(\sym)
+SYM_FUNC_END(\sym)
 .endm
 
 #ifdef CONFIG_TRACING
@@ -739,7 +739,7 @@ apicinterrupt IRQ_WORK_VECTOR			irq_work_interrupt		smp_irq_work_interrupt
 #define CPU_TSS_IST(x) PER_CPU_VAR(cpu_tss) + (TSS_ist + ((x) - 1) * 8)
 
 .macro idtentry sym do_sym has_error_code:req paranoid=0 shift_ist=-1
-ENTRY(\sym)
+SYM_FUNC_START(\sym)
 	/* Sanity check */
 	.if \shift_ist != -1 && \paranoid == 0
 	.error "using shift_ist requires paranoid=1"
@@ -826,7 +826,7 @@ ENTRY(\sym)
 
 	jmp	error_exit			/* %ebx: no swapgs flag */
 	.endif
-ENDPROC(\sym)
+SYM_FUNC_END(\sym)
 .endm
 
 #ifdef CONFIG_TRACING
@@ -859,7 +859,7 @@ idtentry simd_coprocessor_error		do_simd_coprocessor_error	has_error_code=0
 	 * Reload gs selector with exception handling
 	 * edi:  new selector
 	 */
-ENTRY(native_load_gs_index)
+SYM_FUNC_START(native_load_gs_index)
 	pushfq
 	DISABLE_INTERRUPTS(CLBR_ANY & ~CLBR_RDI)
 	SWAPGS
@@ -869,7 +869,7 @@ ENTRY(native_load_gs_index)
 	SWAPGS
 	popfq
 	ret
-ENDPROC(native_load_gs_index)
+SYM_FUNC_END(native_load_gs_index)
 EXPORT_SYMBOL(native_load_gs_index)
 
 	_ASM_EXTABLE(.Lgs_change, bad_gs)
@@ -890,7 +890,7 @@ SYM_FUNC_END(bad_gs)
 	.previous
 
 /* Call softirq on interrupt stack. Interrupts are off. */
-ENTRY(do_softirq_own_stack)
+SYM_FUNC_START(do_softirq_own_stack)
 	pushq	%rbp
 	mov	%rsp, %rbp
 	incl	PER_CPU_VAR(irq_count)
@@ -900,7 +900,7 @@ ENTRY(do_softirq_own_stack)
 	leaveq
 	decl	PER_CPU_VAR(irq_count)
 	ret
-ENDPROC(do_softirq_own_stack)
+SYM_FUNC_END(do_softirq_own_stack)
 
 #ifdef CONFIG_XEN
 idtentry xen_hypervisor_callback xen_do_hypervisor_callback has_error_code=0
@@ -952,7 +952,7 @@ SYM_FUNC_END(xen_do_hypervisor_callback)
  * We distinguish between categories by comparing each saved segment register
  * with its current contents: any discrepancy means we in category 1.
  */
-ENTRY(xen_failsafe_callback)
+SYM_FUNC_START(xen_failsafe_callback)
 	movl	%ds, %ecx
 	cmpw	%cx, 0x10(%rsp)
 	jne	1f
@@ -983,7 +983,7 @@ ENTRY(xen_failsafe_callback)
 	SAVE_EXTRA_REGS
 	ENCODE_FRAME_POINTER
 	jmp	error_exit
-ENDPROC(xen_failsafe_callback)
+SYM_FUNC_END(xen_failsafe_callback)
 
 apicinterrupt3 HYPERVISOR_CALLBACK_VECTOR \
 	xen_hvm_callback_vector xen_evtchn_do_upcall
@@ -1162,7 +1162,7 @@ SYM_FUNC_START_LOCAL(error_exit)
 SYM_FUNC_END(error_exit)
 
 /* Runs on exception stack */
-ENTRY(nmi)
+SYM_FUNC_START(nmi)
 	/*
 	 * Fix up the exception frame if we're on Xen.
 	 * PARAVIRT_ADJUST_EXCEPTION_FRAME is guaranteed to push at most
@@ -1507,14 +1507,14 @@ nmi_restore:
 	 * mode, so this cannot result in a fault.
 	 */
 	INTERRUPT_RETURN
-ENDPROC(nmi)
+SYM_FUNC_END(nmi)
 
-ENTRY(ignore_sysret)
+SYM_FUNC_START(ignore_sysret)
 	mov	$-ENOSYS, %eax
 	sysret
-ENDPROC(ignore_sysret)
+SYM_FUNC_END(ignore_sysret)
 
-ENTRY(rewind_stack_do_exit)
+SYM_FUNC_START(rewind_stack_do_exit)
 	/* Prevent any naive code from trying to unwind to our caller. */
 	xorl	%ebp, %ebp
 
@@ -1523,4 +1523,4 @@ ENTRY(rewind_stack_do_exit)
 
 	call	do_exit
 1:	jmp 1b
-ENDPROC(rewind_stack_do_exit)
+SYM_FUNC_END(rewind_stack_do_exit)
diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
index b7934ef3f5bb..c6163b3abc8c 100644
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -45,7 +45,7 @@
  * ebp  user stack
  * 0(%ebp) arg6
  */
-ENTRY(entry_SYSENTER_compat)
+SYM_FUNC_START(entry_SYSENTER_compat)
 	/* Interrupts are off on entry. */
 	SWAPGS_UNSAFE_STACK
 	movq	PER_CPU_VAR(cpu_current_top_of_stack), %rsp
@@ -132,7 +132,7 @@ ENTRY(entry_SYSENTER_compat)
 	popfq
 	jmp	.Lsysenter_flags_fixed
 SYM_FUNC_INNER_LABEL(__end_entry_SYSENTER_compat, SYM_V_GLOBAL)
-ENDPROC(entry_SYSENTER_compat)
+SYM_FUNC_END(entry_SYSENTER_compat)
 
 /*
  * 32-bit SYSCALL entry.
@@ -181,7 +181,7 @@ ENDPROC(entry_SYSENTER_compat)
  * esp  user stack
  * 0(%esp) arg6
  */
-ENTRY(entry_SYSCALL_compat)
+SYM_FUNC_START(entry_SYSCALL_compat)
 	/* Interrupts are off on entry. */
 	SWAPGS_UNSAFE_STACK
 
@@ -262,7 +262,7 @@ sysret32_from_system_call:
 	movq	RSP-ORIG_RAX(%rsp), %rsp
 	swapgs
 	sysretl
-ENDPROC(entry_SYSCALL_compat)
+SYM_FUNC_END(entry_SYSCALL_compat)
 
 /*
  * 32-bit legacy system call entry.
@@ -290,7 +290,7 @@ ENDPROC(entry_SYSCALL_compat)
  * edi  arg5
  * ebp  arg6
  */
-ENTRY(entry_INT80_compat)
+SYM_FUNC_START(entry_INT80_compat)
 	/*
 	 * Interrupts are off on entry.
 	 */
@@ -340,9 +340,9 @@ ENTRY(entry_INT80_compat)
 	TRACE_IRQS_ON
 	SWAPGS
 	jmp	restore_regs_and_iret
-ENDPROC(entry_INT80_compat)
+SYM_FUNC_END(entry_INT80_compat)
 
-ENTRY(stub32_clone)
+SYM_FUNC_START(stub32_clone)
 	/*
 	 * The 32-bit clone ABI is: clone(..., int tls_val, int *child_tidptr).
 	 * The 64-bit clone ABI is: clone(..., int *child_tidptr, int tls_val).
@@ -352,4 +352,4 @@ ENTRY(stub32_clone)
 	 */
 	xchg	%r8, %rcx
 	jmp	sys_clone
-ENDPROC(stub32_clone)
+SYM_FUNC_END(stub32_clone)
diff --git a/arch/x86/kernel/acpi/wakeup_64.S b/arch/x86/kernel/acpi/wakeup_64.S
index 8fca92dd9144..c159b25bf056 100644
--- a/arch/x86/kernel/acpi/wakeup_64.S
+++ b/arch/x86/kernel/acpi/wakeup_64.S
@@ -13,7 +13,7 @@
 	/*
 	 * Hooray, we are in Long 64-bit mode (but still running in low memory)
 	 */
-ENTRY(wakeup_long64)
+SYM_FUNC_START(wakeup_long64)
 	movq	saved_magic, %rax
 	movq	$0x123456789abcdef0, %rdx
 	cmpq	%rdx, %rax
@@ -34,13 +34,13 @@ ENTRY(wakeup_long64)
 
 	movq	saved_rip, %rax
 	jmp	*%rax
-ENDPROC(wakeup_long64)
+SYM_FUNC_END(wakeup_long64)
 
 SYM_FUNC_START_LOCAL(bogus_64_magic)
 	jmp	bogus_64_magic
 SYM_FUNC_END(bogus_64_magic)
 
-ENTRY(do_suspend_lowlevel)
+SYM_FUNC_START(do_suspend_lowlevel)
 	FRAME_BEGIN
 	subq	$8, %rsp
 	xorl	%eax, %eax
@@ -123,7 +123,7 @@ ENTRY(do_suspend_lowlevel)
 	addq	$8, %rsp
 	FRAME_END
 	jmp	restore_processor_state
-ENDPROC(do_suspend_lowlevel)
+SYM_FUNC_END(do_suspend_lowlevel)
 
 .data
 SYM_DATA_SIMPLE_LOCAL(saved_rbp, .quad 0)
diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S
index aef60bbe854d..0970d85693c4 100644
--- a/arch/x86/kernel/ftrace_64.S
+++ b/arch/x86/kernel/ftrace_64.S
@@ -145,11 +145,11 @@ EXPORT_SYMBOL(mcount)
 
 #ifdef CONFIG_DYNAMIC_FTRACE
 
-ENTRY(function_hook)
+SYM_FUNC_START(function_hook)
 	retq
-ENDPROC(function_hook)
+SYM_FUNC_END(function_hook)
 
-ENTRY(ftrace_caller)
+SYM_FUNC_START(ftrace_caller)
 	/* save_mcount_regs fills in first two parameters */
 	save_mcount_regs
 
@@ -183,9 +183,9 @@ SYM_FUNC_INNER_LABEL(ftrace_graph_call, SYM_V_GLOBAL)
 /* This is weak to keep gas from relaxing the jumps */
 WEAK(ftrace_stub)
 	retq
-ENDPROC(ftrace_caller)
+SYM_FUNC_END(ftrace_caller)
 
-ENTRY(ftrace_regs_caller)
+SYM_FUNC_START(ftrace_regs_caller)
 	/* Save the current flags before any operations that can change them */
 	pushfq
 
@@ -254,12 +254,12 @@ SYM_FUNC_INNER_LABEL(ftrace_regs_caller_end, SYM_V_GLOBAL)
 
 	jmp ftrace_epilogue
 
-ENDPROC(ftrace_regs_caller)
+SYM_FUNC_END(ftrace_regs_caller)
 
 
 #else /* ! CONFIG_DYNAMIC_FTRACE */
 
-ENTRY(function_hook)
+SYM_FUNC_START(function_hook)
 	cmpq $ftrace_stub, ftrace_trace_function
 	jnz trace
 
@@ -290,11 +290,11 @@ trace:
 	restore_mcount_regs
 
 	jmp fgraph_trace
-ENDPROC(function_hook)
+SYM_FUNC_END(function_hook)
 #endif /* CONFIG_DYNAMIC_FTRACE */
 
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
-ENTRY(ftrace_graph_caller)
+SYM_FUNC_START(ftrace_graph_caller)
 	/* Saves rbp into %rdx and fills first parameter  */
 	save_mcount_regs
 
@@ -312,9 +312,9 @@ ENTRY(ftrace_graph_caller)
 	restore_mcount_regs
 
 	retq
-ENDPROC(ftrace_graph_caller)
+SYM_FUNC_END(ftrace_graph_caller)
 
-ENTRY(return_to_handler)
+SYM_FUNC_START(return_to_handler)
 	subq  $24, %rsp
 
 	/* Save the return values */
@@ -329,5 +329,5 @@ ENTRY(return_to_handler)
 	movq (%rsp), %rax
 	addq $24, %rsp
 	jmp *%rdi
-ENDPROC(return_to_handler)
+SYM_FUNC_END(return_to_handler)
 #endif
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 3c9037a65ee9..a08c5d891a80 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -46,7 +46,7 @@ L3_START_KERNEL = pud_index(__START_KERNEL_map)
 	.text
 	__HEAD
 	.code64
-ENTRY(startup_64)
+SYM_FUNC_START(startup_64)
 	/*
 	 * At this point the CPU runs in 64bit mode CS.L = 1 CS.D = 0,
 	 * and someone has loaded an identity mapped page table
@@ -167,9 +167,9 @@ ENTRY(startup_64)
 .Lskip_fixup:
 	movq	$(early_level4_pgt - __START_KERNEL_map), %rax
 	jmp 1f
-ENDPROC(startup_64)
+SYM_FUNC_END(startup_64)
 
-ENTRY(secondary_startup_64)
+SYM_FUNC_START(secondary_startup_64)
 	/*
 	 * At this point the CPU runs in 64bit mode CS.L = 1 CS.D = 0,
 	 * and someone has loaded a mapped page table.
@@ -304,7 +304,7 @@ ENTRY(secondary_startup_64)
 	pushq	%rax		# target address in negative space
 	lretq
 .Lafter_lret:
-ENDPROC(secondary_startup_64)
+SYM_FUNC_END(secondary_startup_64)
 
 #include "verify_cpu.S"
 
@@ -314,10 +314,10 @@ ENDPROC(secondary_startup_64)
  * up already except stack. We just set up stack here. Then call
  * start_secondary() via .Ljump_to_C_code.
  */
-ENTRY(start_cpu0)
+SYM_FUNC_START(start_cpu0)
 	movq	initial_stack(%rip), %rsp
 	jmp	.Ljump_to_C_code
-ENDPROC(start_cpu0)
+SYM_FUNC_END(start_cpu0)
 #endif
 
 	/* Both SMP bootup and ACPI suspend change these variables */
@@ -338,7 +338,7 @@ SYM_FUNC_START_LOCAL(bad_address)
 SYM_FUNC_END(bad_address)
 
 	__INIT
-ENTRY(early_idt_handler_array)
+SYM_FUNC_START(early_idt_handler_array)
 	# 104(%rsp) %rflags
 	#  96(%rsp) %cs
 	#  88(%rsp) %rip
@@ -353,7 +353,7 @@ ENTRY(early_idt_handler_array)
 	i = i + 1
 	.fill early_idt_handler_array + i*EARLY_IDT_HANDLER_SIZE - ., 1, 0xcc
 	.endr
-ENDPROC(early_idt_handler_array)
+SYM_FUNC_END(early_idt_handler_array)
 
 SYM_FUNC_START_LOCAL(early_idt_handler_common)
 	/*
diff --git a/arch/x86/lib/checksum_32.S b/arch/x86/lib/checksum_32.S
index 4d34bb548b41..a048436ce3ac 100644
--- a/arch/x86/lib/checksum_32.S
+++ b/arch/x86/lib/checksum_32.S
@@ -283,7 +283,7 @@ unsigned int csum_partial_copy_generic (const char *src, char *dst,
 #define ARGBASE 16		
 #define FP		12
 		
-ENTRY(csum_partial_copy_generic)
+SYM_FUNC_START(csum_partial_copy_generic)
 	subl  $4,%esp	
 	pushl %edi
 	pushl %esi
@@ -401,7 +401,7 @@ DST(	movb %cl, (%edi)	)
 	popl %edi
 	popl %ecx			# equivalent to addl $4,%esp
 	ret	
-ENDPROC(csum_partial_copy_generic)
+SYM_FUNC_END(csum_partial_copy_generic)
 
 #else
 
@@ -419,7 +419,7 @@ ENDPROC(csum_partial_copy_generic)
 
 #define ARGBASE 12
 		
-ENTRY(csum_partial_copy_generic)
+SYM_FUNC_START(csum_partial_copy_generic)
 	pushl %ebx
 	pushl %edi
 	pushl %esi
@@ -486,7 +486,7 @@ DST(	movb %dl, (%edi)         )
 	popl %edi
 	popl %ebx
 	ret
-ENDPROC(csum_partial_copy_generic)
+SYM_FUNC_END(csum_partial_copy_generic)
 				
 #undef ROUND
 #undef ROUND1		
diff --git a/arch/x86/lib/clear_page_64.S b/arch/x86/lib/clear_page_64.S
index 81b1635d67de..8c4c26c282fd 100644
--- a/arch/x86/lib/clear_page_64.S
+++ b/arch/x86/lib/clear_page_64.S
@@ -14,15 +14,15 @@
  * Zero a page.
  * %rdi	- page
  */
-ENTRY(clear_page_rep)
+SYM_FUNC_START(clear_page_rep)
 	movl $4096/8,%ecx
 	xorl %eax,%eax
 	rep stosq
 	ret
-ENDPROC(clear_page_rep)
+SYM_FUNC_END(clear_page_rep)
 EXPORT_SYMBOL_GPL(clear_page_rep)
 
-ENTRY(clear_page_orig)
+SYM_FUNC_START(clear_page_orig)
 	xorl   %eax,%eax
 	movl   $4096/64,%ecx
 	.p2align 4
@@ -41,13 +41,13 @@ ENTRY(clear_page_orig)
 	jnz	.Lloop
 	nop
 	ret
-ENDPROC(clear_page_orig)
+SYM_FUNC_END(clear_page_orig)
 EXPORT_SYMBOL_GPL(clear_page_orig)
 
-ENTRY(clear_page_erms)
+SYM_FUNC_START(clear_page_erms)
 	movl $4096,%ecx
 	xorl %eax,%eax
 	rep stosb
 	ret
-ENDPROC(clear_page_erms)
+SYM_FUNC_END(clear_page_erms)
 EXPORT_SYMBOL_GPL(clear_page_erms)
diff --git a/arch/x86/lib/cmpxchg16b_emu.S b/arch/x86/lib/cmpxchg16b_emu.S
index 9b330242e740..b6ba6360b3ca 100644
--- a/arch/x86/lib/cmpxchg16b_emu.S
+++ b/arch/x86/lib/cmpxchg16b_emu.S
@@ -19,7 +19,7 @@
  * %rcx : high 64 bits of new value
  * %al  : Operation successful
  */
-ENTRY(this_cpu_cmpxchg16b_emu)
+SYM_FUNC_START(this_cpu_cmpxchg16b_emu)
 
 #
 # Emulate 'cmpxchg16b %gs:(%rsi)' except we return the result in %al not
@@ -50,4 +50,4 @@ ENTRY(this_cpu_cmpxchg16b_emu)
 	xor %al,%al
 	ret
 
-ENDPROC(this_cpu_cmpxchg16b_emu)
+SYM_FUNC_END(this_cpu_cmpxchg16b_emu)
diff --git a/arch/x86/lib/cmpxchg8b_emu.S b/arch/x86/lib/cmpxchg8b_emu.S
index 03a186fc06ea..77aa18db3968 100644
--- a/arch/x86/lib/cmpxchg8b_emu.S
+++ b/arch/x86/lib/cmpxchg8b_emu.S
@@ -19,7 +19,7 @@
  * %ebx : low 32 bits of new value
  * %ecx : high 32 bits of new value
  */
-ENTRY(cmpxchg8b_emu)
+SYM_FUNC_START(cmpxchg8b_emu)
 
 #
 # Emulate 'cmpxchg8b (%esi)' on UP except we don't
@@ -48,5 +48,5 @@ ENTRY(cmpxchg8b_emu)
 	popfl
 	ret
 
-ENDPROC(cmpxchg8b_emu)
+SYM_FUNC_END(cmpxchg8b_emu)
 EXPORT_SYMBOL(cmpxchg8b_emu)
diff --git a/arch/x86/lib/copy_page_64.S b/arch/x86/lib/copy_page_64.S
index e1ee50bc161a..6ba635d23fcc 100644
--- a/arch/x86/lib/copy_page_64.S
+++ b/arch/x86/lib/copy_page_64.S
@@ -12,12 +12,12 @@
  * prefetch distance based on SMP/UP.
  */
 	ALIGN
-ENTRY(copy_page)
+SYM_FUNC_START(copy_page)
 	ALTERNATIVE "jmp copy_page_regs", "", X86_FEATURE_REP_GOOD
 	movl	$4096/8, %ecx
 	rep	movsq
 	ret
-ENDPROC(copy_page)
+SYM_FUNC_END(copy_page)
 EXPORT_SYMBOL(copy_page)
 
 SYM_FUNC_START_LOCAL(copy_page_regs)
diff --git a/arch/x86/lib/copy_user_64.S b/arch/x86/lib/copy_user_64.S
index c5959576c315..a4253494faa8 100644
--- a/arch/x86/lib/copy_user_64.S
+++ b/arch/x86/lib/copy_user_64.S
@@ -29,7 +29,7 @@
  * Output:
  * eax uncopied bytes or 0 if successful.
  */
-ENTRY(copy_user_generic_unrolled)
+SYM_FUNC_START(copy_user_generic_unrolled)
 	ASM_STAC
 	cmpl $8,%edx
 	jb 20f		/* less then 8 bytes, go to byte copy loop */
@@ -111,7 +111,7 @@ ENTRY(copy_user_generic_unrolled)
 	_ASM_EXTABLE(19b,40b)
 	_ASM_EXTABLE(21b,50b)
 	_ASM_EXTABLE(22b,50b)
-ENDPROC(copy_user_generic_unrolled)
+SYM_FUNC_END(copy_user_generic_unrolled)
 EXPORT_SYMBOL(copy_user_generic_unrolled)
 
 /* Some CPUs run faster using the string copy instructions.
@@ -132,7 +132,7 @@ EXPORT_SYMBOL(copy_user_generic_unrolled)
  * Output:
  * eax uncopied bytes or 0 if successful.
  */
-ENTRY(copy_user_generic_string)
+SYM_FUNC_START(copy_user_generic_string)
 	ASM_STAC
 	cmpl $8,%edx
 	jb 2f		/* less than 8 bytes, go to byte copy loop */
@@ -157,7 +157,7 @@ ENTRY(copy_user_generic_string)
 
 	_ASM_EXTABLE(1b,11b)
 	_ASM_EXTABLE(3b,12b)
-ENDPROC(copy_user_generic_string)
+SYM_FUNC_END(copy_user_generic_string)
 EXPORT_SYMBOL(copy_user_generic_string)
 
 /*
@@ -172,7 +172,7 @@ EXPORT_SYMBOL(copy_user_generic_string)
  * Output:
  * eax uncopied bytes or 0 if successful.
  */
-ENTRY(copy_user_enhanced_fast_string)
+SYM_FUNC_START(copy_user_enhanced_fast_string)
 	ASM_STAC
 	movl %edx,%ecx
 1:	rep
@@ -187,7 +187,7 @@ ENTRY(copy_user_enhanced_fast_string)
 	.previous
 
 	_ASM_EXTABLE(1b,12b)
-ENDPROC(copy_user_enhanced_fast_string)
+SYM_FUNC_END(copy_user_enhanced_fast_string)
 EXPORT_SYMBOL(copy_user_enhanced_fast_string)
 
 /*
@@ -199,7 +199,7 @@ EXPORT_SYMBOL(copy_user_enhanced_fast_string)
  *  - Require 8-byte alignment when size is 8 bytes or larger.
  *  - Require 4-byte alignment when size is 4 bytes.
  */
-ENTRY(__copy_user_nocache)
+SYM_FUNC_START(__copy_user_nocache)
 	ASM_STAC
 
 	/* If size is less than 8 bytes, go to 4-byte copy */
@@ -338,5 +338,5 @@ ENTRY(__copy_user_nocache)
 	_ASM_EXTABLE(31b,.L_fixup_4b_copy)
 	_ASM_EXTABLE(40b,.L_fixup_1b_copy)
 	_ASM_EXTABLE(41b,.L_fixup_1b_copy)
-ENDPROC(__copy_user_nocache)
+SYM_FUNC_END(__copy_user_nocache)
 EXPORT_SYMBOL(__copy_user_nocache)
diff --git a/arch/x86/lib/csum-copy_64.S b/arch/x86/lib/csum-copy_64.S
index 7e48807b2fa1..e93301673bb1 100644
--- a/arch/x86/lib/csum-copy_64.S
+++ b/arch/x86/lib/csum-copy_64.S
@@ -45,7 +45,7 @@
 	.endm
 
 
-ENTRY(csum_partial_copy_generic)
+SYM_FUNC_START(csum_partial_copy_generic)
 	cmpl	$3*64, %edx
 	jle	.Lignore
 
@@ -221,4 +221,4 @@ ENTRY(csum_partial_copy_generic)
 	jz   .Lende
 	movl $-EFAULT, (%rax)
 	jmp .Lende
-ENDPROC(csum_partial_copy_generic)
+SYM_FUNC_END(csum_partial_copy_generic)
diff --git a/arch/x86/lib/getuser.S b/arch/x86/lib/getuser.S
index 29f0707a3913..56b4a13678f5 100644
--- a/arch/x86/lib/getuser.S
+++ b/arch/x86/lib/getuser.S
@@ -35,7 +35,7 @@
 #include <asm/export.h>
 
 	.text
-ENTRY(__get_user_1)
+SYM_FUNC_START(__get_user_1)
 	mov PER_CPU_VAR(current_task), %_ASM_DX
 	cmp TASK_addr_limit(%_ASM_DX),%_ASM_AX
 	jae bad_get_user
@@ -44,10 +44,10 @@ ENTRY(__get_user_1)
 	xor %eax,%eax
 	ASM_CLAC
 	ret
-ENDPROC(__get_user_1)
+SYM_FUNC_END(__get_user_1)
 EXPORT_SYMBOL(__get_user_1)
 
-ENTRY(__get_user_2)
+SYM_FUNC_START(__get_user_2)
 	add $1,%_ASM_AX
 	jc bad_get_user
 	mov PER_CPU_VAR(current_task), %_ASM_DX
@@ -58,10 +58,10 @@ ENTRY(__get_user_2)
 	xor %eax,%eax
 	ASM_CLAC
 	ret
-ENDPROC(__get_user_2)
+SYM_FUNC_END(__get_user_2)
 EXPORT_SYMBOL(__get_user_2)
 
-ENTRY(__get_user_4)
+SYM_FUNC_START(__get_user_4)
 	add $3,%_ASM_AX
 	jc bad_get_user
 	mov PER_CPU_VAR(current_task), %_ASM_DX
@@ -72,10 +72,10 @@ ENTRY(__get_user_4)
 	xor %eax,%eax
 	ASM_CLAC
 	ret
-ENDPROC(__get_user_4)
+SYM_FUNC_END(__get_user_4)
 EXPORT_SYMBOL(__get_user_4)
 
-ENTRY(__get_user_8)
+SYM_FUNC_START(__get_user_8)
 #ifdef CONFIG_X86_64
 	add $7,%_ASM_AX
 	jc bad_get_user
@@ -100,7 +100,7 @@ ENTRY(__get_user_8)
 	ASM_CLAC
 	ret
 #endif
-ENDPROC(__get_user_8)
+SYM_FUNC_END(__get_user_8)
 EXPORT_SYMBOL(__get_user_8)
 
 
diff --git a/arch/x86/lib/hweight.S b/arch/x86/lib/hweight.S
index 23d893cbc200..f520a1a92ef6 100644
--- a/arch/x86/lib/hweight.S
+++ b/arch/x86/lib/hweight.S
@@ -7,7 +7,7 @@
  * unsigned int __sw_hweight32(unsigned int w)
  * %rdi: w
  */
-ENTRY(__sw_hweight32)
+SYM_FUNC_START(__sw_hweight32)
 
 #ifdef CONFIG_X86_64
 	movl %edi, %eax				# w
@@ -32,10 +32,10 @@ ENTRY(__sw_hweight32)
 	shrl $24, %eax				# w = w_tmp >> 24
 	__ASM_SIZE(pop,) %__ASM_REG(dx)
 	ret
-ENDPROC(__sw_hweight32)
+SYM_FUNC_END(__sw_hweight32)
 EXPORT_SYMBOL(__sw_hweight32)
 
-ENTRY(__sw_hweight64)
+SYM_FUNC_START(__sw_hweight64)
 #ifdef CONFIG_X86_64
 	pushq   %rdi
 	pushq   %rdx
@@ -78,5 +78,5 @@ ENTRY(__sw_hweight64)
 	popl    %ecx
 	ret
 #endif
-ENDPROC(__sw_hweight64)
+SYM_FUNC_END(__sw_hweight64)
 EXPORT_SYMBOL(__sw_hweight64)
diff --git a/arch/x86/lib/iomap_copy_64.S b/arch/x86/lib/iomap_copy_64.S
index 33147fef3452..2246fbf32fa8 100644
--- a/arch/x86/lib/iomap_copy_64.S
+++ b/arch/x86/lib/iomap_copy_64.S
@@ -20,8 +20,8 @@
 /*
  * override generic version in lib/iomap_copy.c
  */
-ENTRY(__iowrite32_copy)
+SYM_FUNC_START(__iowrite32_copy)
 	movl %edx,%ecx
 	rep movsd
 	ret
-ENDPROC(__iowrite32_copy)
+SYM_FUNC_END(__iowrite32_copy)
diff --git a/arch/x86/lib/memcpy_64.S b/arch/x86/lib/memcpy_64.S
index 728703c47d58..9bec63e212a8 100644
--- a/arch/x86/lib/memcpy_64.S
+++ b/arch/x86/lib/memcpy_64.S
@@ -188,7 +188,7 @@ SYM_FUNC_END(memcpy_orig)
  * Note that we only catch machine checks when reading the source addresses.
  * Writes to target are posted and don't generate machine checks.
  */
-ENTRY(memcpy_mcsafe_unrolled)
+SYM_FUNC_START(memcpy_mcsafe_unrolled)
 	cmpl $8, %edx
 	/* Less than 8 bytes? Go to byte copy loop */
 	jb .L_no_whole_words
@@ -276,7 +276,7 @@ ENTRY(memcpy_mcsafe_unrolled)
 .L_done_memcpy_trap:
 	xorq %rax, %rax
 	ret
-ENDPROC(memcpy_mcsafe_unrolled)
+SYM_FUNC_END(memcpy_mcsafe_unrolled)
 EXPORT_SYMBOL_GPL(memcpy_mcsafe_unrolled)
 
 	.section .fixup, "ax"
diff --git a/arch/x86/lib/memmove_64.S b/arch/x86/lib/memmove_64.S
index d22af97e5b27..4f3f6359fcf9 100644
--- a/arch/x86/lib/memmove_64.S
+++ b/arch/x86/lib/memmove_64.S
@@ -26,7 +26,7 @@
 .weak memmove
 
 SYM_FUNC_START_ALIAS(memmove)
-ENTRY(__memmove)
+SYM_FUNC_START(__memmove)
 
 	/* Handle more 32 bytes in loop */
 	mov %rdi, %rax
@@ -206,7 +206,7 @@ ENTRY(__memmove)
 	movb %r11b, (%rdi)
 13:
 	retq
-ENDPROC(__memmove)
+SYM_FUNC_END(__memmove)
 SYM_FUNC_END_ALIAS(memmove)
 EXPORT_SYMBOL(__memmove)
 EXPORT_SYMBOL(memmove)
diff --git a/arch/x86/lib/memset_64.S b/arch/x86/lib/memset_64.S
index c63ae9987612..cee5514926e6 100644
--- a/arch/x86/lib/memset_64.S
+++ b/arch/x86/lib/memset_64.S
@@ -19,7 +19,7 @@
  * rax   original destination
  */
 SYM_FUNC_START_ALIAS(memset)
-ENTRY(__memset)
+SYM_FUNC_START(__memset)
 	/*
 	 * Some CPUs support enhanced REP MOVSB/STOSB feature. It is recommended
 	 * to use it when possible. If not available, use fast string instructions.
@@ -42,7 +42,7 @@ ENTRY(__memset)
 	rep stosb
 	movq %r9,%rax
 	ret
-ENDPROC(__memset)
+SYM_FUNC_END(__memset)
 SYM_FUNC_END_ALIAS(memset)
 EXPORT_SYMBOL(memset)
 EXPORT_SYMBOL(__memset)
diff --git a/arch/x86/lib/msr-reg.S b/arch/x86/lib/msr-reg.S
index c81556409bbb..1e8a9cb07b4f 100644
--- a/arch/x86/lib/msr-reg.S
+++ b/arch/x86/lib/msr-reg.S
@@ -11,7 +11,7 @@
  *
  */
 .macro op_safe_regs op
-ENTRY(\op\()_safe_regs)
+SYM_FUNC_START(\op\()_safe_regs)
 	pushq %rbx
 	pushq %rbp
 	movq	%rdi, %r10	/* Save pointer */
@@ -40,13 +40,13 @@ ENTRY(\op\()_safe_regs)
 	jmp     2b
 
 	_ASM_EXTABLE(1b, 3b)
-ENDPROC(\op\()_safe_regs)
+SYM_FUNC_END(\op\()_safe_regs)
 .endm
 
 #else /* X86_32 */
 
 .macro op_safe_regs op
-ENTRY(\op\()_safe_regs)
+SYM_FUNC_START(\op\()_safe_regs)
 	pushl %ebx
 	pushl %ebp
 	pushl %esi
@@ -82,7 +82,7 @@ ENTRY(\op\()_safe_regs)
 	jmp     2b
 
 	_ASM_EXTABLE(1b, 3b)
-ENDPROC(\op\()_safe_regs)
+SYM_FUNC_END(\op\()_safe_regs)
 .endm
 
 #endif
diff --git a/arch/x86/lib/putuser.S b/arch/x86/lib/putuser.S
index d77883f36875..4d015af97968 100644
--- a/arch/x86/lib/putuser.S
+++ b/arch/x86/lib/putuser.S
@@ -35,7 +35,7 @@
 		ret
 
 .text
-ENTRY(__put_user_1)
+SYM_FUNC_START(__put_user_1)
 	ENTER
 	cmp TASK_addr_limit(%_ASM_BX),%_ASM_CX
 	jae bad_put_user
@@ -43,10 +43,10 @@ ENTRY(__put_user_1)
 1:	movb %al,(%_ASM_CX)
 	xor %eax,%eax
 	EXIT
-ENDPROC(__put_user_1)
+SYM_FUNC_END(__put_user_1)
 EXPORT_SYMBOL(__put_user_1)
 
-ENTRY(__put_user_2)
+SYM_FUNC_START(__put_user_2)
 	ENTER
 	mov TASK_addr_limit(%_ASM_BX),%_ASM_BX
 	sub $1,%_ASM_BX
@@ -56,10 +56,10 @@ ENTRY(__put_user_2)
 2:	movw %ax,(%_ASM_CX)
 	xor %eax,%eax
 	EXIT
-ENDPROC(__put_user_2)
+SYM_FUNC_END(__put_user_2)
 EXPORT_SYMBOL(__put_user_2)
 
-ENTRY(__put_user_4)
+SYM_FUNC_START(__put_user_4)
 	ENTER
 	mov TASK_addr_limit(%_ASM_BX),%_ASM_BX
 	sub $3,%_ASM_BX
@@ -69,10 +69,10 @@ ENTRY(__put_user_4)
 3:	movl %eax,(%_ASM_CX)
 	xor %eax,%eax
 	EXIT
-ENDPROC(__put_user_4)
+SYM_FUNC_END(__put_user_4)
 EXPORT_SYMBOL(__put_user_4)
 
-ENTRY(__put_user_8)
+SYM_FUNC_START(__put_user_8)
 	ENTER
 	mov TASK_addr_limit(%_ASM_BX),%_ASM_BX
 	sub $7,%_ASM_BX
@@ -85,7 +85,7 @@ ENTRY(__put_user_8)
 #endif
 	xor %eax,%eax
 	EXIT
-ENDPROC(__put_user_8)
+SYM_FUNC_END(__put_user_8)
 EXPORT_SYMBOL(__put_user_8)
 
 SYM_FUNC_START_LOCAL(bad_put_user)
diff --git a/arch/x86/lib/rwsem.S b/arch/x86/lib/rwsem.S
index bf2c6074efd2..e3905664944d 100644
--- a/arch/x86/lib/rwsem.S
+++ b/arch/x86/lib/rwsem.S
@@ -86,7 +86,7 @@
 #endif
 
 /* Fix up special calling conventions */
-ENTRY(call_rwsem_down_read_failed)
+SYM_FUNC_START(call_rwsem_down_read_failed)
 	FRAME_BEGIN
 	save_common_regs
 	__ASM_SIZE(push,) %__ASM_REG(dx)
@@ -96,9 +96,9 @@ ENTRY(call_rwsem_down_read_failed)
 	restore_common_regs
 	FRAME_END
 	ret
-ENDPROC(call_rwsem_down_read_failed)
+SYM_FUNC_END(call_rwsem_down_read_failed)
 
-ENTRY(call_rwsem_down_write_failed)
+SYM_FUNC_START(call_rwsem_down_write_failed)
 	FRAME_BEGIN
 	save_common_regs
 	movq %rax,%rdi
@@ -106,9 +106,9 @@ ENTRY(call_rwsem_down_write_failed)
 	restore_common_regs
 	FRAME_END
 	ret
-ENDPROC(call_rwsem_down_write_failed)
+SYM_FUNC_END(call_rwsem_down_write_failed)
 
-ENTRY(call_rwsem_down_write_failed_killable)
+SYM_FUNC_START(call_rwsem_down_write_failed_killable)
 	FRAME_BEGIN
 	save_common_regs
 	movq %rax,%rdi
@@ -116,9 +116,9 @@ ENTRY(call_rwsem_down_write_failed_killable)
 	restore_common_regs
 	FRAME_END
 	ret
-ENDPROC(call_rwsem_down_write_failed_killable)
+SYM_FUNC_END(call_rwsem_down_write_failed_killable)
 
-ENTRY(call_rwsem_wake)
+SYM_FUNC_START(call_rwsem_wake)
 	FRAME_BEGIN
 	/* do nothing if still outstanding active readers */
 	__ASM_HALF_SIZE(dec) %__ASM_HALF_REG(dx)
@@ -129,9 +129,9 @@ ENTRY(call_rwsem_wake)
 	restore_common_regs
 1:	FRAME_END
 	ret
-ENDPROC(call_rwsem_wake)
+SYM_FUNC_END(call_rwsem_wake)
 
-ENTRY(call_rwsem_downgrade_wake)
+SYM_FUNC_START(call_rwsem_downgrade_wake)
 	FRAME_BEGIN
 	save_common_regs
 	__ASM_SIZE(push,) %__ASM_REG(dx)
@@ -141,4 +141,4 @@ ENTRY(call_rwsem_downgrade_wake)
 	restore_common_regs
 	FRAME_END
 	ret
-ENDPROC(call_rwsem_downgrade_wake)
+SYM_FUNC_END(call_rwsem_downgrade_wake)
diff --git a/arch/x86/net/bpf_jit.S b/arch/x86/net/bpf_jit.S
index 823edd6f1db7..8e7075280148 100644
--- a/arch/x86/net/bpf_jit.S
+++ b/arch/x86/net/bpf_jit.S
@@ -23,12 +23,12 @@
 	32 /* space for rbx,r13,r14,r15 */ + \
 	8 /* space for skb_copy_bits */)
 
-ENTRY(sk_load_word)
+SYM_FUNC_START(sk_load_word)
 	test	%esi,%esi
 	js	bpf_slow_path_word_neg
-ENDPROC(sk_load_word)
+SYM_FUNC_END(sk_load_word)
 
-ENTRY(sk_load_word_positive_offset)
+SYM_FUNC_START(sk_load_word_positive_offset)
 	mov	%r9d,%eax		# hlen
 	sub	%esi,%eax		# hlen - offset
 	cmp	$3,%eax
@@ -36,14 +36,14 @@ ENTRY(sk_load_word_positive_offset)
 	mov     (SKBDATA,%rsi),%eax
 	bswap   %eax  			/* ntohl() */
 	ret
-ENDPROC(sk_load_word_positive_offset)
+SYM_FUNC_END(sk_load_word_positive_offset)
 
-ENTRY(sk_load_half)
+SYM_FUNC_START(sk_load_half)
 	test	%esi,%esi
 	js	bpf_slow_path_half_neg
-ENDPROC(sk_load_half)
+SYM_FUNC_END(sk_load_half)
 
-ENTRY(sk_load_half_positive_offset)
+SYM_FUNC_START(sk_load_half_positive_offset)
 	mov	%r9d,%eax
 	sub	%esi,%eax		#	hlen - offset
 	cmp	$1,%eax
@@ -51,19 +51,19 @@ ENTRY(sk_load_half_positive_offset)
 	movzwl	(SKBDATA,%rsi),%eax
 	rol	$8,%ax			# ntohs()
 	ret
-ENDPROC(sk_load_half_positive_offset)
+SYM_FUNC_END(sk_load_half_positive_offset)
 
-ENTRY(sk_load_byte)
+SYM_FUNC_START(sk_load_byte)
 	test	%esi,%esi
 	js	bpf_slow_path_byte_neg
-ENDPROC(sk_load_byte)
+SYM_FUNC_END(sk_load_byte)
 
-ENTRY(sk_load_byte_positive_offset)
+SYM_FUNC_START(sk_load_byte_positive_offset)
 	cmp	%esi,%r9d   /* if (offset >= hlen) goto bpf_slow_path_byte */
 	jle	bpf_slow_path_byte
 	movzbl	(SKBDATA,%rsi),%eax
 	ret
-ENDPROC(sk_load_byte_positive_offset)
+SYM_FUNC_END(sk_load_byte_positive_offset)
 
 /* rsi contains offset and can be scratched */
 #define bpf_slow_path_common(LEN)		\
@@ -124,36 +124,36 @@ SYM_FUNC_START_LOCAL(bpf_slow_path_word_neg)
 	jl	bpf_error	/* offset lower -> error  */
 SYM_FUNC_END(bpf_slow_path_word_neg)
 
-ENTRY(sk_load_word_negative_offset)
+SYM_FUNC_START(sk_load_word_negative_offset)
 	sk_negative_common(4)
 	mov	(%rax), %eax
 	bswap	%eax
 	ret
-ENDPROC(sk_load_word_negative_offset)
+SYM_FUNC_END(sk_load_word_negative_offset)
 
 SYM_FUNC_START_LOCAL(bpf_slow_path_half_neg)
 	cmp	SKF_MAX_NEG_OFF, %esi
 	jl	bpf_error
 SYM_FUNC_END(bpf_slow_path_half_neg)
 
-ENTRY(sk_load_half_negative_offset)
+SYM_FUNC_START(sk_load_half_negative_offset)
 	sk_negative_common(2)
 	mov	(%rax),%ax
 	rol	$8,%ax
 	movzwl	%ax,%eax
 	ret
-ENDPROC(sk_load_half_negative_offset)
+SYM_FUNC_END(sk_load_half_negative_offset)
 
 SYM_FUNC_START_LOCAL(bpf_slow_path_byte_neg)
 	cmp	SKF_MAX_NEG_OFF, %esi
 	jl	bpf_error
 SYM_FUNC_END(bpf_slow_path_byte_neg)
 
-ENTRY(sk_load_byte_negative_offset)
+SYM_FUNC_START(sk_load_byte_negative_offset)
 	sk_negative_common(1)
 	movzbl	(%rax), %eax
 	ret
-ENDPROC(sk_load_byte_negative_offset)
+SYM_FUNC_END(sk_load_byte_negative_offset)
 
 SYM_FUNC_START_LOCAL(bpf_error)
 # force a return 0 from jit handler
diff --git a/arch/x86/platform/efi/efi_stub_64.S b/arch/x86/platform/efi/efi_stub_64.S
index cd95075944ab..90936c4e396e 100644
--- a/arch/x86/platform/efi/efi_stub_64.S
+++ b/arch/x86/platform/efi/efi_stub_64.S
@@ -38,7 +38,7 @@
 	mov %rsi, %cr0;			\
 	mov (%rsp), %rsp
 
-ENTRY(efi_call)
+SYM_FUNC_START(efi_call)
 	pushq %rbp
 	movq %rsp, %rbp
 	SAVE_XMM
@@ -54,4 +54,4 @@ ENTRY(efi_call)
 	RESTORE_XMM
 	popq %rbp
 	ret
-ENDPROC(efi_call)
+SYM_FUNC_END(efi_call)
diff --git a/arch/x86/platform/efi/efi_thunk_64.S b/arch/x86/platform/efi/efi_thunk_64.S
index d18697df1fe9..012601609d81 100644
--- a/arch/x86/platform/efi/efi_thunk_64.S
+++ b/arch/x86/platform/efi/efi_thunk_64.S
@@ -24,7 +24,7 @@
 
 	.text
 	.code64
-ENTRY(efi64_thunk)
+SYM_FUNC_START(efi64_thunk)
 	push	%rbp
 	push	%rbx
 
@@ -59,7 +59,7 @@ ENTRY(efi64_thunk)
 	pop	%rbx
 	pop	%rbp
 	retq
-ENDPROC(efi64_thunk)
+SYM_FUNC_END(efi64_thunk)
 
 /*
  * We run this function from the 1:1 mapping.
diff --git a/arch/x86/platform/olpc/xo1-wakeup.S b/arch/x86/platform/olpc/xo1-wakeup.S
index 2929091cf7fd..93ba74de2c55 100644
--- a/arch/x86/platform/olpc/xo1-wakeup.S
+++ b/arch/x86/platform/olpc/xo1-wakeup.S
@@ -89,7 +89,7 @@ restore_registers:
 
 	ret
 
-ENTRY(do_olpc_suspend_lowlevel)
+SYM_FUNC_START(do_olpc_suspend_lowlevel)
 	call	save_processor_state
 	call	save_registers
 
@@ -109,7 +109,7 @@ ret_point:
 	call	restore_registers
 	call	restore_processor_state
 	ret
-ENDPROC(do_olpc_suspend_lowlevel)
+SYM_FUNC_END(do_olpc_suspend_lowlevel)
 
 .data
 saved_gdt:             .long   0,0
diff --git a/arch/x86/power/hibernate_asm_64.S b/arch/x86/power/hibernate_asm_64.S
index ec6b26fd3a7e..e836ce085691 100644
--- a/arch/x86/power/hibernate_asm_64.S
+++ b/arch/x86/power/hibernate_asm_64.S
@@ -23,7 +23,7 @@
 #include <asm/processor-flags.h>
 #include <asm/frame.h>
 
-ENTRY(swsusp_arch_suspend)
+SYM_FUNC_START(swsusp_arch_suspend)
 	movq	$saved_context, %rax
 	movq	%rsp, pt_regs_sp(%rax)
 	movq	%rbp, pt_regs_bp(%rax)
@@ -51,9 +51,9 @@ ENTRY(swsusp_arch_suspend)
 	call swsusp_save
 	FRAME_END
 	ret
-ENDPROC(swsusp_arch_suspend)
+SYM_FUNC_END(swsusp_arch_suspend)
 
-ENTRY(restore_image)
+SYM_FUNC_START(restore_image)
 	/* prepare to jump to the image kernel */
 	movq	restore_jump_address(%rip), %r8
 	movq	restore_cr3(%rip), %r9
@@ -68,10 +68,10 @@ ENTRY(restore_image)
 	/* jump to relocated restore code */
 	movq	relocated_restore_code(%rip), %rcx
 	jmpq	*%rcx
-ENDPROC(restore_image)
+SYM_FUNC_END(restore_image)
 
 	/* code below has been relocated to a safe page */
-ENTRY(core_restore_code)
+SYM_FUNC_START(core_restore_code)
 	/* switch to temporary page tables */
 	movq	%rax, %cr3
 	/* flush TLB */
@@ -99,11 +99,11 @@ ENTRY(core_restore_code)
 .Ldone:
 	/* jump to the restore_registers address from the image header */
 	jmpq	*%r8
-ENDPROC(core_restore_code)
+SYM_FUNC_END(core_restore_code)
 
 	 /* code below belongs to the image kernel */
 	.align PAGE_SIZE
-ENTRY(restore_registers)
+SYM_FUNC_START(restore_registers)
 	/* go back to the original page tables */
 	movq    %r9, %cr3
 
@@ -145,4 +145,4 @@ ENTRY(restore_registers)
 	movq	%rax, in_suspend(%rip)
 
 	ret
-ENDPROC(restore_registers)
+SYM_FUNC_END(restore_registers)
diff --git a/arch/x86/realmode/rm/reboot.S b/arch/x86/realmode/rm/reboot.S
index 370ed1fe34e4..72224849f6c1 100644
--- a/arch/x86/realmode/rm/reboot.S
+++ b/arch/x86/realmode/rm/reboot.S
@@ -18,7 +18,7 @@
  */
 	.section ".text32", "ax"
 	.code32
-ENTRY(machine_real_restart_asm)
+SYM_FUNC_START(machine_real_restart_asm)
 
 #ifdef CONFIG_X86_64
 	/* Switch to trampoline GDT as it is guaranteed < 4 GiB */
@@ -62,7 +62,7 @@ SYM_FUNC_INNER_LABEL(machine_real_restart_paging_off, SYM_V_GLOBAL)
 	movl	%ecx, %gs
 	movl	%ecx, %ss
 	ljmpw	$8, $1f
-ENDPROC(machine_real_restart_asm)
+SYM_FUNC_END(machine_real_restart_asm)
 
 /*
  * This is 16-bit protected mode code to disable paging and the cache,
diff --git a/arch/x86/realmode/rm/trampoline_64.S b/arch/x86/realmode/rm/trampoline_64.S
index f1f2f18fff85..36c0c68709b7 100644
--- a/arch/x86/realmode/rm/trampoline_64.S
+++ b/arch/x86/realmode/rm/trampoline_64.S
@@ -36,7 +36,7 @@
 	.code16
 
 	.balign	PAGE_SIZE
-ENTRY(trampoline_start)
+SYM_FUNC_START(trampoline_start)
 	cli			# We should be safe anyway
 	wbinvd
 
@@ -79,14 +79,14 @@ ENTRY(trampoline_start)
 no_longmode:
 	hlt
 	jmp no_longmode
-ENDPROC(trampoline_start)
+SYM_FUNC_END(trampoline_start)
 
 #include "../kernel/verify_cpu.S"
 
 	.section ".text32","ax"
 	.code32
 	.balign 4
-ENTRY(startup_32)
+SYM_FUNC_START(startup_32)
 	movl	%edx, %ss
 	addl	$pa_real_mode_base, %esp
 	movl	%edx, %ds
@@ -118,15 +118,15 @@ ENTRY(startup_32)
 	 * the new gdt/idt that has __KERNEL_CS with CS.L = 1.
 	 */
 	ljmpl	$__KERNEL_CS, $pa_startup_64
-ENDPROC(startup_32)
+SYM_FUNC_END(startup_32)
 
 	.section ".text64","ax"
 	.code64
 	.balign 4
-ENTRY(startup_64)
+SYM_FUNC_START(startup_64)
 	# Now jump into the kernel using virtual addresses
 	jmpq	*tr_start(%rip)
-ENDPROC(startup_64)
+SYM_FUNC_END(startup_64)
 
 	.section ".rodata","a"
 	# Duplicate the global descriptor table
diff --git a/arch/x86/realmode/rm/wakeup_asm.S b/arch/x86/realmode/rm/wakeup_asm.S
index 41483fd2d247..8501b4d17dca 100644
--- a/arch/x86/realmode/rm/wakeup_asm.S
+++ b/arch/x86/realmode/rm/wakeup_asm.S
@@ -36,7 +36,7 @@ SYM_DATA_END(wakeup_header)
 	.code16
 
 	.balign	16
-ENTRY(wakeup_start)
+SYM_FUNC_START(wakeup_start)
 	cli
 	cld
 
@@ -134,7 +134,7 @@ ENTRY(wakeup_start)
 #else
 	jmp	trampoline_start
 #endif
-ENDPROC(wakeup_start)
+SYM_FUNC_END(wakeup_start)
 
 bogus_real_magic:
 1:
diff --git a/arch/x86/xen/xen-asm.S b/arch/x86/xen/xen-asm.S
index eff224df813f..cfa05b8259ab 100644
--- a/arch/x86/xen/xen-asm.S
+++ b/arch/x86/xen/xen-asm.S
@@ -23,7 +23,7 @@
  * event status with one and operation.  If there are pending events,
  * then enter the hypervisor to get them handled.
  */
-ENTRY(xen_irq_enable_direct)
+SYM_FUNC_START(xen_irq_enable_direct)
 	FRAME_BEGIN
 	/* Unmask events */
 	movb $0, PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_mask
@@ -43,7 +43,7 @@ ENTRY(xen_irq_enable_direct)
 ENDPATCH(xen_irq_enable_direct)
 	FRAME_END
 	ret
-	ENDPROC(xen_irq_enable_direct)
+	SYM_FUNC_END(xen_irq_enable_direct)
 	RELOC(xen_irq_enable_direct, 2b+1)
 
 
@@ -51,11 +51,11 @@ ENDPATCH(xen_irq_enable_direct)
  * Disabling events is simply a matter of making the event mask
  * non-zero.
  */
-ENTRY(xen_irq_disable_direct)
+SYM_FUNC_START(xen_irq_disable_direct)
 	movb $1, PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_mask
 ENDPATCH(xen_irq_disable_direct)
 	ret
-	ENDPROC(xen_irq_disable_direct)
+	SYM_FUNC_END(xen_irq_disable_direct)
 	RELOC(xen_irq_disable_direct, 0)
 
 /*
@@ -67,13 +67,13 @@ ENDPATCH(xen_irq_disable_direct)
  * undefined.  We need to toggle the state of the bit, because Xen and
  * x86 use opposite senses (mask vs enable).
  */
-ENTRY(xen_save_fl_direct)
+SYM_FUNC_START(xen_save_fl_direct)
 	testb $0xff, PER_CPU_VAR(xen_vcpu_info) + XEN_vcpu_info_mask
 	setz %ah
 	addb %ah, %ah
 ENDPATCH(xen_save_fl_direct)
 	ret
-	ENDPROC(xen_save_fl_direct)
+	SYM_FUNC_END(xen_save_fl_direct)
 	RELOC(xen_save_fl_direct, 0)
 
 
@@ -84,7 +84,7 @@ ENDPATCH(xen_save_fl_direct)
  * interrupt mask state, it checks for unmasked pending events and
  * enters the hypervisor to get them delivered if so.
  */
-ENTRY(xen_restore_fl_direct)
+SYM_FUNC_START(xen_restore_fl_direct)
 	FRAME_BEGIN
 #ifdef CONFIG_X86_64
 	testw $X86_EFLAGS_IF, %di
@@ -106,7 +106,7 @@ ENTRY(xen_restore_fl_direct)
 ENDPATCH(xen_restore_fl_direct)
 	FRAME_END
 	ret
-	ENDPROC(xen_restore_fl_direct)
+	SYM_FUNC_END(xen_restore_fl_direct)
 	RELOC(xen_restore_fl_direct, 2b+1)
 
 
@@ -114,7 +114,7 @@ ENDPATCH(xen_restore_fl_direct)
  * Force an event check by making a hypercall, but preserve regs
  * before making the call.
  */
-ENTRY(check_events)
+SYM_FUNC_START(check_events)
 	FRAME_BEGIN
 #ifdef CONFIG_X86_32
 	push %eax
@@ -147,4 +147,4 @@ ENTRY(check_events)
 #endif
 	FRAME_END
 	ret
-ENDPROC(check_events)
+SYM_FUNC_END(check_events)
diff --git a/arch/x86/xen/xen-asm_64.S b/arch/x86/xen/xen-asm_64.S
index e1174171ab57..a29c8a064eda 100644
--- a/arch/x86/xen/xen-asm_64.S
+++ b/arch/x86/xen/xen-asm_64.S
@@ -22,11 +22,11 @@
 
 #include "xen-asm.h"
 
-ENTRY(xen_adjust_exception_frame)
+SYM_FUNC_START(xen_adjust_exception_frame)
 	mov 8+0(%rsp), %rcx
 	mov 8+8(%rsp), %r11
 	ret $16
-ENDPROC(xen_adjust_exception_frame)
+SYM_FUNC_END(xen_adjust_exception_frame)
 
 hypercall_iret = hypercall_page + __HYPERVISOR_iret * 32
 /*
@@ -44,14 +44,14 @@ hypercall_iret = hypercall_page + __HYPERVISOR_iret * 32
  *	r11		}<-- pushed by hypercall page
  * rsp->rax		}
  */
-ENTRY(xen_iret)
+SYM_FUNC_START(xen_iret)
 	pushq $0
 1:	jmp hypercall_iret
 ENDPATCH(xen_iret)
 RELOC(xen_iret, 1b+1)
-ENDPROC(xen_iret)
+SYM_FUNC_END(xen_iret)
 
-ENTRY(xen_sysret64)
+SYM_FUNC_START(xen_sysret64)
 	/*
 	 * We're already on the usermode stack at this point, but
 	 * still with the kernel gs, so we can easily switch back
@@ -69,7 +69,7 @@ ENTRY(xen_sysret64)
 1:	jmp hypercall_iret
 ENDPATCH(xen_sysret64)
 RELOC(xen_sysret64, 1b+1)
-ENDPROC(xen_sysret64)
+SYM_FUNC_END(xen_sysret64)
 
 /*
  * Xen handles syscall callbacks much like ordinary exceptions, which
@@ -96,34 +96,34 @@ ENDPROC(xen_sysret64)
 .endm
 
 /* Normal 64-bit system call target */
-ENTRY(xen_syscall_target)
+SYM_FUNC_START(xen_syscall_target)
 	undo_xen_syscall
 	jmp entry_SYSCALL_64_after_swapgs
-ENDPROC(xen_syscall_target)
+SYM_FUNC_END(xen_syscall_target)
 
 #ifdef CONFIG_IA32_EMULATION
 
 /* 32-bit compat syscall target */
-ENTRY(xen_syscall32_target)
+SYM_FUNC_START(xen_syscall32_target)
 	undo_xen_syscall
 	jmp entry_SYSCALL_compat
-ENDPROC(xen_syscall32_target)
+SYM_FUNC_END(xen_syscall32_target)
 
 /* 32-bit compat sysenter target */
-ENTRY(xen_sysenter_target)
+SYM_FUNC_START(xen_sysenter_target)
 	undo_xen_syscall
 	jmp entry_SYSENTER_compat
-ENDPROC(xen_sysenter_target)
+SYM_FUNC_END(xen_sysenter_target)
 
 #else /* !CONFIG_IA32_EMULATION */
 
 SYM_FUNC_START_ALIAS(xen_syscall32_target)
-ENTRY(xen_sysenter_target)
+SYM_FUNC_START(xen_sysenter_target)
 	lea 16(%rsp), %rsp	/* strip %rcx, %r11 */
 	mov $-ENOSYS, %rax
 	pushq $0
 	jmp hypercall_iret
-ENDPROC(xen_sysenter_target)
+SYM_FUNC_END(xen_sysenter_target)
 SYM_FUNC_END_ALIAS(xen_syscall32_target)
 
 #endif	/* CONFIG_IA32_EMULATION */
diff --git a/arch/x86/xen/xen-head.S b/arch/x86/xen/xen-head.S
index 95eb4978791b..5c541f422055 100644
--- a/arch/x86/xen/xen-head.S
+++ b/arch/x86/xen/xen-head.S
@@ -18,7 +18,7 @@
 
 #ifdef CONFIG_XEN_PV
 	__INIT
-ENTRY(startup_xen)
+SYM_FUNC_START(startup_xen)
 	cld
 
 	/* Clear .bss */
@@ -33,15 +33,15 @@ ENTRY(startup_xen)
 	mov $init_thread_union+THREAD_SIZE, %_ASM_SP
 
 	jmp xen_start_kernel
-ENDPROC(startup_xen)
+SYM_FUNC_END(startup_xen)
 	__FINIT
 #endif
 
 .pushsection .text
 	.balign PAGE_SIZE
-ENTRY(hypercall_page)
+SYM_FUNC_START(hypercall_page)
 	.skip PAGE_SIZE
-ENDPROC(hypercall_page)
+SYM_FUNC_END(hypercall_page)
 
 #define HYPERCALL(n) \
 	.equ xen_hypercall_##n, hypercall_page + __HYPERVISOR_##n * 32; \
diff --git a/include/linux/linkage.h b/include/linux/linkage.h
index 27af3543fbc9..f51928ae175b 100644
--- a/include/linux/linkage.h
+++ b/include/linux/linkage.h
@@ -99,11 +99,13 @@
 
 /* === DEPRECATED annotations === */
 
+#ifndef CONFIG_X86_64
 #ifndef ENTRY
 /* deprecated, use SYM_FUNC_START */
 #define ENTRY(name) \
 	SYM_FUNC_START(name)
 #endif
+#endif /* CONFIG_X86_64 */
 #endif /* LINKER_SCRIPT */
 
 #ifndef WEAK
@@ -120,6 +122,7 @@
 #endif
 #endif /* CONFIG_X86 */
 
+#ifndef CONFIG_X86_64
 /* If symbol 'name' is treated as a subroutine (gets called, and returns)
  * then please use ENDPROC to mark 'name' as STT_FUNC for the benefit of
  * static analysis tools such as stack depth analyzer.
@@ -129,6 +132,7 @@
 #define ENDPROC(name) \
 	SYM_FUNC_END(name)
 #endif
+#endif /* CONFIG_X86_64 */
 
 /* === generic annotations === */
 
-- 
2.12.2


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 07/29] x86: bpf_jit, use ENTRY+ENDPROC
  2017-04-21 14:12 ` [PATCH v3 07/29] x86: bpf_jit, use ENTRY+ENDPROC Jiri Slaby
@ 2017-04-21 19:32   ` Alexei Starovoitov
  2017-04-24  6:45     ` Jiri Slaby
  0 siblings, 1 reply; 17+ messages in thread
From: Alexei Starovoitov @ 2017-04-21 19:32 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: mingo, tglx, hpa, x86, jpoimboe, linux-kernel, David S. Miller,
	netdev, Daniel Borkmann, Eric Dumazet

On Fri, Apr 21, 2017 at 04:12:43PM +0200, Jiri Slaby wrote:
> Do not use a custom macro FUNC for starts of the global functions, use
> ENTRY instead.
> 
> And while at it, annotate also ends of the functions by ENDPROC.
> 
> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
> Cc: James Morris <jmorris@namei.org>
> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
> Cc: Patrick McHardy <kaber@trash.net>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: x86@kernel.org
> Cc: netdev@vger.kernel.org
> ---
>  arch/x86/net/bpf_jit.S | 32 ++++++++++++++++++--------------
>  1 file changed, 18 insertions(+), 14 deletions(-)
> 
> diff --git a/arch/x86/net/bpf_jit.S b/arch/x86/net/bpf_jit.S
> index f2a7faf4706e..762c29fb8832 100644
> --- a/arch/x86/net/bpf_jit.S
> +++ b/arch/x86/net/bpf_jit.S
> @@ -23,16 +23,12 @@
>  	32 /* space for rbx,r13,r14,r15 */ + \
>  	8 /* space for skb_copy_bits */)
>  
> -#define FUNC(name) \
> -	.globl name; \
> -	.type name, @function; \
> -	name:
> -
> -FUNC(sk_load_word)
> +ENTRY(sk_load_word)
>  	test	%esi,%esi
>  	js	bpf_slow_path_word_neg
> +ENDPROC(sk_load_word)

this doens't look right.
It will add alignment nops in critical paths of these pseudo functions.
I'm also not sure whether it will still work afterwards.
Was it tested?
I'd prefer if this code kept as-is.

> -FUNC(sk_load_word_positive_offset)
> +ENTRY(sk_load_word_positive_offset)
>  	mov	%r9d,%eax		# hlen
>  	sub	%esi,%eax		# hlen - offset
>  	cmp	$3,%eax
> @@ -40,12 +36,14 @@ FUNC(sk_load_word_positive_offset)
>  	mov     (SKBDATA,%rsi),%eax
>  	bswap   %eax  			/* ntohl() */
>  	ret
> +ENDPROC(sk_load_word_positive_offset)
>  
> -FUNC(sk_load_half)
> +ENTRY(sk_load_half)
>  	test	%esi,%esi
>  	js	bpf_slow_path_half_neg
> +ENDPROC(sk_load_half)
>  
> -FUNC(sk_load_half_positive_offset)
> +ENTRY(sk_load_half_positive_offset)
>  	mov	%r9d,%eax
>  	sub	%esi,%eax		#	hlen - offset
>  	cmp	$1,%eax
> @@ -53,16 +51,19 @@ FUNC(sk_load_half_positive_offset)
>  	movzwl	(SKBDATA,%rsi),%eax
>  	rol	$8,%ax			# ntohs()
>  	ret
> +ENDPROC(sk_load_half_positive_offset)
>  
> -FUNC(sk_load_byte)
> +ENTRY(sk_load_byte)
>  	test	%esi,%esi
>  	js	bpf_slow_path_byte_neg
> +ENDPROC(sk_load_byte)
>  
> -FUNC(sk_load_byte_positive_offset)
> +ENTRY(sk_load_byte_positive_offset)
>  	cmp	%esi,%r9d   /* if (offset >= hlen) goto bpf_slow_path_byte */
>  	jle	bpf_slow_path_byte
>  	movzbl	(SKBDATA,%rsi),%eax
>  	ret
> +ENDPROC(sk_load_byte_positive_offset)
>  
>  /* rsi contains offset and can be scratched */
>  #define bpf_slow_path_common(LEN)		\
> @@ -119,31 +120,34 @@ bpf_slow_path_word_neg:
>  	cmp	SKF_MAX_NEG_OFF, %esi	/* test range */
>  	jl	bpf_error	/* offset lower -> error  */
>  
> -FUNC(sk_load_word_negative_offset)
> +ENTRY(sk_load_word_negative_offset)
>  	sk_negative_common(4)
>  	mov	(%rax), %eax
>  	bswap	%eax
>  	ret
> +ENDPROC(sk_load_word_negative_offset)
>  
>  bpf_slow_path_half_neg:
>  	cmp	SKF_MAX_NEG_OFF, %esi
>  	jl	bpf_error
>  
> -FUNC(sk_load_half_negative_offset)
> +ENTRY(sk_load_half_negative_offset)
>  	sk_negative_common(2)
>  	mov	(%rax),%ax
>  	rol	$8,%ax
>  	movzwl	%ax,%eax
>  	ret
> +ENDPROC(sk_load_half_negative_offset)
>  
>  bpf_slow_path_byte_neg:
>  	cmp	SKF_MAX_NEG_OFF, %esi
>  	jl	bpf_error
>  
> -FUNC(sk_load_byte_negative_offset)
> +ENTRY(sk_load_byte_negative_offset)
>  	sk_negative_common(1)
>  	movzbl	(%rax), %eax
>  	ret
> +ENDPROC(sk_load_byte_negative_offset)
>  
>  bpf_error:
>  # force a return 0 from jit handler
> -- 
> 2.12.2
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 07/29] x86: bpf_jit, use ENTRY+ENDPROC
  2017-04-21 19:32   ` Alexei Starovoitov
@ 2017-04-24  6:45     ` Jiri Slaby
  2017-04-24 14:41       ` David Miller
  0 siblings, 1 reply; 17+ messages in thread
From: Jiri Slaby @ 2017-04-24  6:45 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: mingo, tglx, hpa, x86, jpoimboe, linux-kernel, David S. Miller,
	netdev, Daniel Borkmann, Eric Dumazet

On 04/21/2017, 09:32 PM, Alexei Starovoitov wrote:
> On Fri, Apr 21, 2017 at 04:12:43PM +0200, Jiri Slaby wrote:
>> Do not use a custom macro FUNC for starts of the global functions, use
>> ENTRY instead.
>>
>> And while at it, annotate also ends of the functions by ENDPROC.
>>
>> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
>> Cc: "David S. Miller" <davem@davemloft.net>
>> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
>> Cc: James Morris <jmorris@namei.org>
>> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
>> Cc: Patrick McHardy <kaber@trash.net>
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> Cc: Ingo Molnar <mingo@redhat.com>
>> Cc: "H. Peter Anvin" <hpa@zytor.com>
>> Cc: x86@kernel.org
>> Cc: netdev@vger.kernel.org
>> ---
>>  arch/x86/net/bpf_jit.S | 32 ++++++++++++++++++--------------
>>  1 file changed, 18 insertions(+), 14 deletions(-)
>>
>> diff --git a/arch/x86/net/bpf_jit.S b/arch/x86/net/bpf_jit.S
>> index f2a7faf4706e..762c29fb8832 100644
>> --- a/arch/x86/net/bpf_jit.S
>> +++ b/arch/x86/net/bpf_jit.S
>> @@ -23,16 +23,12 @@
>>  	32 /* space for rbx,r13,r14,r15 */ + \
>>  	8 /* space for skb_copy_bits */)
>>  
>> -#define FUNC(name) \
>> -	.globl name; \
>> -	.type name, @function; \
>> -	name:
>> -
>> -FUNC(sk_load_word)
>> +ENTRY(sk_load_word)
>>  	test	%esi,%esi
>>  	js	bpf_slow_path_word_neg
>> +ENDPROC(sk_load_word)
> 
> this doens't look right.
> It will add alignment nops in critical paths of these pseudo functions.
> I'm also not sure whether it will still work afterwards.
> Was it tested?
> I'd prefer if this code kept as-is.

It cannot stay as-is simply because we want to know where the functions
end to inject debuginfo properly. The code above does not warrant for
any exception.

Executing a nop takes a little and having externally-callable functions
aligned can actually help performance (no, I haven't measured nor tested
the code). But sure, the tool is generic, so I can introduce a local
macros to avoid alignments in the functions:

#define BPF_FUNC_START_LOCAL(name) \
		SYM_START(name, SYM_V_LOCAL, SYM_A_NONE)
#define BPF_FUNC_START(name) \
		SYM_START(name, SYM_V_GLOBAL, SYM_A_NONE)

#define BPF_FUNC_END(name) SYM_FUNC_END(name)

thanks,
-- 
js
suse labs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 07/29] x86: bpf_jit, use ENTRY+ENDPROC
  2017-04-24  6:45     ` Jiri Slaby
@ 2017-04-24 14:41       ` David Miller
  2017-04-24 14:52         ` Jiri Slaby
  0 siblings, 1 reply; 17+ messages in thread
From: David Miller @ 2017-04-24 14:41 UTC (permalink / raw)
  To: jslaby
  Cc: alexei.starovoitov, mingo, tglx, hpa, x86, jpoimboe, linux-kernel,
	netdev, daniel, edumazet

From: Jiri Slaby <jslaby@suse.cz>
Date: Mon, 24 Apr 2017 08:45:11 +0200

> On 04/21/2017, 09:32 PM, Alexei Starovoitov wrote:
>> On Fri, Apr 21, 2017 at 04:12:43PM +0200, Jiri Slaby wrote:
>>> Do not use a custom macro FUNC for starts of the global functions, use
>>> ENTRY instead.
>>>
>>> And while at it, annotate also ends of the functions by ENDPROC.
>>>
>>> Signed-off-by: Jiri Slaby <jslaby@suse.cz>
>>> Cc: "David S. Miller" <davem@davemloft.net>
>>> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
>>> Cc: James Morris <jmorris@namei.org>
>>> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
>>> Cc: Patrick McHardy <kaber@trash.net>
>>> Cc: Thomas Gleixner <tglx@linutronix.de>
>>> Cc: Ingo Molnar <mingo@redhat.com>
>>> Cc: "H. Peter Anvin" <hpa@zytor.com>
>>> Cc: x86@kernel.org
>>> Cc: netdev@vger.kernel.org
>>> ---
>>>  arch/x86/net/bpf_jit.S | 32 ++++++++++++++++++--------------
>>>  1 file changed, 18 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/arch/x86/net/bpf_jit.S b/arch/x86/net/bpf_jit.S
>>> index f2a7faf4706e..762c29fb8832 100644
>>> --- a/arch/x86/net/bpf_jit.S
>>> +++ b/arch/x86/net/bpf_jit.S
>>> @@ -23,16 +23,12 @@
>>>  	32 /* space for rbx,r13,r14,r15 */ + \
>>>  	8 /* space for skb_copy_bits */)
>>>  
>>> -#define FUNC(name) \
>>> -	.globl name; \
>>> -	.type name, @function; \
>>> -	name:
>>> -
>>> -FUNC(sk_load_word)
>>> +ENTRY(sk_load_word)
>>>  	test	%esi,%esi
>>>  	js	bpf_slow_path_word_neg
>>> +ENDPROC(sk_load_word)
>> 
>> this doens't look right.
>> It will add alignment nops in critical paths of these pseudo functions.
>> I'm also not sure whether it will still work afterwards.
>> Was it tested?
>> I'd prefer if this code kept as-is.
> 
> It cannot stay as-is simply because we want to know where the functions
> end to inject debuginfo properly. The code above does not warrant for
> any exception.

I totally and completely disagree.

> Executing a nop takes a little and having externally-callable functions
> aligned can actually help performance (no, I haven't measured nor tested
> the code). But sure, the tool is generic, so I can introduce a local
> macros to avoid alignments in the functions:

Not for this case, it's a bunch of entry points all packed together
intentionally so that SKB accesses of different access sizes (which is
almost always the case) from BPF programs use the smallest amount of
I-cache as possible.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 07/29] x86: bpf_jit, use ENTRY+ENDPROC
  2017-04-24 14:41       ` David Miller
@ 2017-04-24 14:52         ` Jiri Slaby
  2017-04-24 15:08           ` David Miller
  0 siblings, 1 reply; 17+ messages in thread
From: Jiri Slaby @ 2017-04-24 14:52 UTC (permalink / raw)
  To: David Miller
  Cc: alexei.starovoitov, mingo, tglx, hpa, x86, jpoimboe, linux-kernel,
	netdev, daniel, edumazet

On 04/24/2017, 04:41 PM, David Miller wrote:
>> It cannot stay as-is simply because we want to know where the functions
>> end to inject debuginfo properly. The code above does not warrant for
>> any exception.
> 
> I totally and completely disagree.

You can disagree as you wish but there is really nothing special on the
bpf code with respect to annotations.

>> Executing a nop takes a little and having externally-callable functions
>> aligned can actually help performance (no, I haven't measured nor tested
>> the code). But sure, the tool is generic, so I can introduce a local
>> macros to avoid alignments in the functions:
> 
> Not for this case, it's a bunch of entry points all packed together
> intentionally so that SKB accesses of different access sizes (which is
> almost always the case) from BPF programs use the smallest amount of
> I-cache as possible.

And for that reason I suggested the special macros for the code (see the
macros in the e-mail you replied to again). So what problem do you
actually have with the suggested solution?

thanks,
-- 
js
suse labs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 07/29] x86: bpf_jit, use ENTRY+ENDPROC
  2017-04-24 14:52         ` Jiri Slaby
@ 2017-04-24 15:08           ` David Miller
  2017-04-24 15:41             ` Jiri Slaby
  0 siblings, 1 reply; 17+ messages in thread
From: David Miller @ 2017-04-24 15:08 UTC (permalink / raw)
  To: jslaby
  Cc: alexei.starovoitov, mingo, tglx, hpa, x86, jpoimboe, linux-kernel,
	netdev, daniel, edumazet

From: Jiri Slaby <jslaby@suse.cz>
Date: Mon, 24 Apr 2017 16:52:43 +0200

> On 04/24/2017, 04:41 PM, David Miller wrote:
>>> It cannot stay as-is simply because we want to know where the functions
>>> end to inject debuginfo properly. The code above does not warrant for
>>> any exception.
>> 
>> I totally and completely disagree.
> 
> You can disagree as you wish but there is really nothing special on the
> bpf code with respect to annotations.
> 
>>> Executing a nop takes a little and having externally-callable functions
>>> aligned can actually help performance (no, I haven't measured nor tested
>>> the code). But sure, the tool is generic, so I can introduce a local
>>> macros to avoid alignments in the functions:
>> 
>> Not for this case, it's a bunch of entry points all packed together
>> intentionally so that SKB accesses of different access sizes (which is
>> almost always the case) from BPF programs use the smallest amount of
>> I-cache as possible.
> 
> And for that reason I suggested the special macros for the code (see the
> macros in the e-mail you replied to again). So what problem do you
> actually have with the suggested solution?

If you align the entry points, then the code sequence as a whole is
are no longer densely packed.

Or do I misunderstand how your macros work?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 07/29] x86: bpf_jit, use ENTRY+ENDPROC
  2017-04-24 15:08           ` David Miller
@ 2017-04-24 15:41             ` Jiri Slaby
  2017-04-24 15:51               ` David Miller
  2017-04-24 15:55               ` Ingo Molnar
  0 siblings, 2 replies; 17+ messages in thread
From: Jiri Slaby @ 2017-04-24 15:41 UTC (permalink / raw)
  To: David Miller
  Cc: alexei.starovoitov, mingo, tglx, hpa, x86, jpoimboe, linux-kernel,
	netdev, daniel, edumazet

On 04/24/2017, 05:08 PM, David Miller wrote:
> If you align the entry points, then the code sequence as a whole is
> are no longer densely packed.

Sure.

> Or do I misunderstand how your macros work?

Perhaps. So the suggested macros for the code are:
#define BPF_FUNC_START_LOCAL(name) \
		SYM_START(name, SYM_V_LOCAL, SYM_A_NONE)
#define BPF_FUNC_START(name) \
		SYM_START(name, SYM_V_GLOBAL, SYM_A_NONE)

and they differ from the standard ones:
#define SYM_FUNC_START_LOCAL(name)                      \
        SYM_START(name, SYM_V_LOCAL, SYM_A_ALIGN)
#define SYM_FUNC_START(name)                            \
        SYM_START(name, SYM_V_GLOBAL, SYM_A_ALIGN)


The difference is SYM_A_NONE vs. SYM_A_ALIGN, which means:
#define SYM_A_ALIGN                             ALIGN
#define SYM_A_NONE                              /* nothing */

Does it look OK now?

thanks,
-- 
js
suse labs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 07/29] x86: bpf_jit, use ENTRY+ENDPROC
  2017-04-24 15:41             ` Jiri Slaby
@ 2017-04-24 15:51               ` David Miller
  2017-04-24 15:53                 ` Jiri Slaby
  2017-04-24 15:55               ` Ingo Molnar
  1 sibling, 1 reply; 17+ messages in thread
From: David Miller @ 2017-04-24 15:51 UTC (permalink / raw)
  To: jslaby
  Cc: alexei.starovoitov, mingo, tglx, hpa, x86, jpoimboe, linux-kernel,
	netdev, daniel, edumazet

From: Jiri Slaby <jslaby@suse.cz>
Date: Mon, 24 Apr 2017 17:41:06 +0200

> On 04/24/2017, 05:08 PM, David Miller wrote:
>> If you align the entry points, then the code sequence as a whole is
>> are no longer densely packed.
> 
> Sure.
> 
>> Or do I misunderstand how your macros work?
> 
> Perhaps. So the suggested macros for the code are:
> #define BPF_FUNC_START_LOCAL(name) \
> 		SYM_START(name, SYM_V_LOCAL, SYM_A_NONE)
> #define BPF_FUNC_START(name) \
> 		SYM_START(name, SYM_V_GLOBAL, SYM_A_NONE)
> 
> and they differ from the standard ones:
> #define SYM_FUNC_START_LOCAL(name)                      \
>         SYM_START(name, SYM_V_LOCAL, SYM_A_ALIGN)
> #define SYM_FUNC_START(name)                            \
>         SYM_START(name, SYM_V_GLOBAL, SYM_A_ALIGN)
> 
> 
> The difference is SYM_A_NONE vs. SYM_A_ALIGN, which means:
> #define SYM_A_ALIGN                             ALIGN
> #define SYM_A_NONE                              /* nothing */
> 
> Does it look OK now?

I said I'm not OK with the alignment, so personally I am not
with how these macros work and what they will do to the code
generated for BPF packet accesses.

But I'll defer to Alexei on this because I don't have the time
nor the energy to fight this.

Thanks.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 07/29] x86: bpf_jit, use ENTRY+ENDPROC
  2017-04-24 15:51               ` David Miller
@ 2017-04-24 15:53                 ` Jiri Slaby
  0 siblings, 0 replies; 17+ messages in thread
From: Jiri Slaby @ 2017-04-24 15:53 UTC (permalink / raw)
  To: David Miller
  Cc: alexei.starovoitov, mingo, tglx, hpa, x86, jpoimboe, linux-kernel,
	netdev, daniel, edumazet

On 04/24/2017, 05:51 PM, David Miller wrote:
> I said I'm not OK with the alignment

So in short, the suggested macros add no alignment.

-- 
js
suse labs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 07/29] x86: bpf_jit, use ENTRY+ENDPROC
  2017-04-24 15:41             ` Jiri Slaby
  2017-04-24 15:51               ` David Miller
@ 2017-04-24 15:55               ` Ingo Molnar
  2017-04-24 16:02                 ` Jiri Slaby
  1 sibling, 1 reply; 17+ messages in thread
From: Ingo Molnar @ 2017-04-24 15:55 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: David Miller, alexei.starovoitov, mingo, tglx, hpa, x86, jpoimboe,
	linux-kernel, netdev, daniel, edumazet


* Jiri Slaby <jslaby@suse.cz> wrote:

> On 04/24/2017, 05:08 PM, David Miller wrote:
> > If you align the entry points, then the code sequence as a whole is
> > are no longer densely packed.
> 
> Sure.
> 
> > Or do I misunderstand how your macros work?
> 
> Perhaps. So the suggested macros for the code are:
> #define BPF_FUNC_START_LOCAL(name) \
> 		SYM_START(name, SYM_V_LOCAL, SYM_A_NONE)
> #define BPF_FUNC_START(name) \
> 		SYM_START(name, SYM_V_GLOBAL, SYM_A_NONE)
> 
> and they differ from the standard ones:
> #define SYM_FUNC_START_LOCAL(name)                      \
>         SYM_START(name, SYM_V_LOCAL, SYM_A_ALIGN)
> #define SYM_FUNC_START(name)                            \
>         SYM_START(name, SYM_V_GLOBAL, SYM_A_ALIGN)
> 
> 
> The difference is SYM_A_NONE vs. SYM_A_ALIGN, which means:
> #define SYM_A_ALIGN                             ALIGN
> #define SYM_A_NONE                              /* nothing */
> 
> Does it look OK now?

No, the patch changes alignment which is undesirable, it needs to preserve the 
existing (non-)alignment of the symbols!

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 07/29] x86: bpf_jit, use ENTRY+ENDPROC
  2017-04-24 15:55               ` Ingo Molnar
@ 2017-04-24 16:02                 ` Jiri Slaby
  2017-04-24 16:40                   ` Ingo Molnar
  2017-04-24 16:47                   ` Alexei Starovoitov
  0 siblings, 2 replies; 17+ messages in thread
From: Jiri Slaby @ 2017-04-24 16:02 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: David Miller, alexei.starovoitov, mingo, tglx, hpa, x86, jpoimboe,
	linux-kernel, netdev, daniel, edumazet

On 04/24/2017, 05:55 PM, Ingo Molnar wrote:
> * Jiri Slaby <jslaby@suse.cz> wrote:
> 
>> On 04/24/2017, 05:08 PM, David Miller wrote:
>>> If you align the entry points, then the code sequence as a whole is
>>> are no longer densely packed.
>>
>> Sure.
>>
>>> Or do I misunderstand how your macros work?
>>
>> Perhaps. So the suggested macros for the code are:
>> #define BPF_FUNC_START_LOCAL(name) \
>> 		SYM_START(name, SYM_V_LOCAL, SYM_A_NONE)
>> #define BPF_FUNC_START(name) \
>> 		SYM_START(name, SYM_V_GLOBAL, SYM_A_NONE)
>>
>> and they differ from the standard ones:
>> #define SYM_FUNC_START_LOCAL(name)                      \
>>         SYM_START(name, SYM_V_LOCAL, SYM_A_ALIGN)
>> #define SYM_FUNC_START(name)                            \
>>         SYM_START(name, SYM_V_GLOBAL, SYM_A_ALIGN)
>>
>>
>> The difference is SYM_A_NONE vs. SYM_A_ALIGN, which means:
>> #define SYM_A_ALIGN                             ALIGN
>> #define SYM_A_NONE                              /* nothing */
>>
>> Does it look OK now?
> 
> No, the patch changes alignment which is undesirable, it needs to preserve the 
> existing (non-)alignment of the symbols!

OK, so I am not expressing myself explicitly enough, it seems.

So, correct, the patch v3 adds alignments. I suggested in the discussion
the macros above. They do not add alignments. If everybody is OK with
that, v4 of the patch won't add alignments. OK?

thanks,
-- 
js
suse labs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 07/29] x86: bpf_jit, use ENTRY+ENDPROC
  2017-04-24 16:02                 ` Jiri Slaby
@ 2017-04-24 16:40                   ` Ingo Molnar
  2017-04-24 16:47                   ` Alexei Starovoitov
  1 sibling, 0 replies; 17+ messages in thread
From: Ingo Molnar @ 2017-04-24 16:40 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: David Miller, alexei.starovoitov, mingo, tglx, hpa, x86, jpoimboe,
	linux-kernel, netdev, daniel, edumazet


* Jiri Slaby <jslaby@suse.cz> wrote:

> On 04/24/2017, 05:55 PM, Ingo Molnar wrote:
> > * Jiri Slaby <jslaby@suse.cz> wrote:
> > 
> >> On 04/24/2017, 05:08 PM, David Miller wrote:
> >>> If you align the entry points, then the code sequence as a whole is
> >>> are no longer densely packed.
> >>
> >> Sure.
> >>
> >>> Or do I misunderstand how your macros work?
> >>
> >> Perhaps. So the suggested macros for the code are:
> >> #define BPF_FUNC_START_LOCAL(name) \
> >> 		SYM_START(name, SYM_V_LOCAL, SYM_A_NONE)
> >> #define BPF_FUNC_START(name) \
> >> 		SYM_START(name, SYM_V_GLOBAL, SYM_A_NONE)
> >>
> >> and they differ from the standard ones:
> >> #define SYM_FUNC_START_LOCAL(name)                      \
> >>         SYM_START(name, SYM_V_LOCAL, SYM_A_ALIGN)
> >> #define SYM_FUNC_START(name)                            \
> >>         SYM_START(name, SYM_V_GLOBAL, SYM_A_ALIGN)
> >>
> >>
> >> The difference is SYM_A_NONE vs. SYM_A_ALIGN, which means:
> >> #define SYM_A_ALIGN                             ALIGN
> >> #define SYM_A_NONE                              /* nothing */
> >>
> >> Does it look OK now?
> > 
> > No, the patch changes alignment which is undesirable, it needs to preserve the 
> > existing (non-)alignment of the symbols!
> 
> OK, so I am not expressing myself explicitly enough, it seems.
> 
> So, correct, the patch v3 adds alignments. I suggested in the discussion
> the macros above. They do not add alignments. If everybody is OK with
> that, v4 of the patch won't add alignments. OK?

Yes.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 07/29] x86: bpf_jit, use ENTRY+ENDPROC
  2017-04-24 16:02                 ` Jiri Slaby
  2017-04-24 16:40                   ` Ingo Molnar
@ 2017-04-24 16:47                   ` Alexei Starovoitov
  2017-04-24 17:51                     ` Jiri Slaby
  1 sibling, 1 reply; 17+ messages in thread
From: Alexei Starovoitov @ 2017-04-24 16:47 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Ingo Molnar, David Miller, mingo, tglx, hpa, x86, jpoimboe,
	linux-kernel, netdev, daniel, edumazet

On Mon, Apr 24, 2017 at 06:02:51PM +0200, Jiri Slaby wrote:
> On 04/24/2017, 05:55 PM, Ingo Molnar wrote:
> > * Jiri Slaby <jslaby@suse.cz> wrote:
> > 
> >> On 04/24/2017, 05:08 PM, David Miller wrote:
> >>> If you align the entry points, then the code sequence as a whole is
> >>> are no longer densely packed.
> >>
> >> Sure.
> >>
> >>> Or do I misunderstand how your macros work?
> >>
> >> Perhaps. So the suggested macros for the code are:
> >> #define BPF_FUNC_START_LOCAL(name) \
> >> 		SYM_START(name, SYM_V_LOCAL, SYM_A_NONE)
> >> #define BPF_FUNC_START(name) \
> >> 		SYM_START(name, SYM_V_GLOBAL, SYM_A_NONE)
> >>
> >> and they differ from the standard ones:
> >> #define SYM_FUNC_START_LOCAL(name)                      \
> >>         SYM_START(name, SYM_V_LOCAL, SYM_A_ALIGN)
> >> #define SYM_FUNC_START(name)                            \
> >>         SYM_START(name, SYM_V_GLOBAL, SYM_A_ALIGN)
> >>
> >>
> >> The difference is SYM_A_NONE vs. SYM_A_ALIGN, which means:
> >> #define SYM_A_ALIGN                             ALIGN
> >> #define SYM_A_NONE                              /* nothing */
> >>
> >> Does it look OK now?
> > 
> > No, the patch changes alignment which is undesirable, it needs to preserve the 
> > existing (non-)alignment of the symbols!
> 
> OK, so I am not expressing myself explicitly enough, it seems.
> 
> So, correct, the patch v3 adds alignments. I suggested in the discussion
> the macros above. They do not add alignments. If everybody is OK with
> that, v4 of the patch won't add alignments. OK?

can we go back to what problem this patch set is trying to solve?
Sounds like you want to add _function_ start/end marks to aid debugging?
Debugging with what? What tool will recognize this stuff?

Take a look at what your patch does:
+ENTRY(sk_load_word)
        test    %esi,%esi
        js      bpf_slow_path_word_neg
+ENDPROC(sk_load_word)

Does above two assembler instructions look like a function?

or this:
+ENTRY(sk_load_byte_positive_offset)
        cmp     %esi,%r9d   /* if (offset >= hlen) goto bpf_slow_path_byte */
        jle     bpf_slow_path_byte
        movzbl  (SKBDATA,%rsi),%eax
        ret
+ENDPROC(sk_load_byte_positive_offset)

This assembler code doesn't represent functions. There is no prologue/epilogue
and no stack frame. JITed code uses 'call' insn to jump into them, but they're
not your typical C functions.
Take a look at bpf_slow_path_common() macro that creates the frame before
calling into C code with 'call skb_copy_bits;'

I still think that this code should be left alone.
Even macro names you're proposing:
 #define BPF_FUNC_START_LOCAL
don't sound right. These are not functions.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 07/29] x86: bpf_jit, use ENTRY+ENDPROC
  2017-04-24 16:47                   ` Alexei Starovoitov
@ 2017-04-24 17:51                     ` Jiri Slaby
  2017-04-24 18:24                       ` David Miller
  0 siblings, 1 reply; 17+ messages in thread
From: Jiri Slaby @ 2017-04-24 17:51 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Ingo Molnar, David Miller, mingo, tglx, hpa, x86, jpoimboe,
	linux-kernel, netdev, daniel, edumazet

On 04/24/2017, 06:47 PM, Alexei Starovoitov wrote:
> On Mon, Apr 24, 2017 at 06:02:51PM +0200, Jiri Slaby wrote:
>> On 04/24/2017, 05:55 PM, Ingo Molnar wrote:
>>> * Jiri Slaby <jslaby@suse.cz> wrote:
>>>
>>>> On 04/24/2017, 05:08 PM, David Miller wrote:
>>>>> If you align the entry points, then the code sequence as a whole is
>>>>> are no longer densely packed.
>>>>
>>>> Sure.
>>>>
>>>>> Or do I misunderstand how your macros work?
>>>>
>>>> Perhaps. So the suggested macros for the code are:
>>>> #define BPF_FUNC_START_LOCAL(name) \
>>>> 		SYM_START(name, SYM_V_LOCAL, SYM_A_NONE)
>>>> #define BPF_FUNC_START(name) \
>>>> 		SYM_START(name, SYM_V_GLOBAL, SYM_A_NONE)
>>>>
>>>> and they differ from the standard ones:
>>>> #define SYM_FUNC_START_LOCAL(name)                      \
>>>>         SYM_START(name, SYM_V_LOCAL, SYM_A_ALIGN)
>>>> #define SYM_FUNC_START(name)                            \
>>>>         SYM_START(name, SYM_V_GLOBAL, SYM_A_ALIGN)
>>>>
>>>>
>>>> The difference is SYM_A_NONE vs. SYM_A_ALIGN, which means:
>>>> #define SYM_A_ALIGN                             ALIGN
>>>> #define SYM_A_NONE                              /* nothing */
>>>>
>>>> Does it look OK now?
>>>
>>> No, the patch changes alignment which is undesirable, it needs to preserve the 
>>> existing (non-)alignment of the symbols!
>>
>> OK, so I am not expressing myself explicitly enough, it seems.
>>
>> So, correct, the patch v3 adds alignments. I suggested in the discussion
>> the macros above. They do not add alignments. If everybody is OK with
>> that, v4 of the patch won't add alignments. OK?
> 
> can we go back to what problem this patch set is trying to solve?
> Sounds like you want to add _function_ start/end marks to aid debugging?
> Debugging with what? What tool will recognize this stuff?

objtool will generate DWARF debuginfo between every ENTRY and ENDPROC
(dubbed differently at the end of the series). DWARF is understood by
everything, including the kernel proper (we have DWARF unwinder in
SUSE's kernels available for decades).

> Take a look at what your patch does:
> +ENTRY(sk_load_word)
>         test    %esi,%esi
>         js      bpf_slow_path_word_neg
> +ENDPROC(sk_load_word)
> 
> Does above two assembler instructions look like a function?

Yes, you are right, the code is complete mess.

For example what's the point of making the sk_load_word_positive_offset
label a global, callable function? Note that this is exactly the reason
why this particular two hunks look weird to you even though the
annotations only mechanically paraphrase what is in the current code.

> or this:
> +ENTRY(sk_load_byte_positive_offset)
>         cmp     %esi,%r9d   /* if (offset >= hlen) goto bpf_slow_path_byte */
>         jle     bpf_slow_path_byte
>         movzbl  (SKBDATA,%rsi),%eax
>         ret
> +ENDPROC(sk_load_byte_positive_offset)
> 
> This assembler code doesn't represent functions. There is no prologue/epilogue
> and no stack frame. JITed code uses 'call' insn to jump into them, but they're
> not your typical C functions.

I am not looking for C functions everywhere in assembly, actually. (In
the ideal assembly, I would and in most cases I really am.) But you are
right the annotations of the current code aren't right. It results in my
annotations being wrong too -- based on invalid assumptions.

> Take a look at bpf_slow_path_common() macro that creates the frame before
> calling into C code with 'call skb_copy_bits;'
> 
> I still think that this code should be left alone.
> Even macro names you're proposing:
>  #define BPF_FUNC_START_LOCAL
> don't sound right. These are not functions.

Well, what is the reason to call them FUNC in the current code then?

thanks,
-- 
js
suse labs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 07/29] x86: bpf_jit, use ENTRY+ENDPROC
  2017-04-24 17:51                     ` Jiri Slaby
@ 2017-04-24 18:24                       ` David Miller
  2017-04-25 14:41                         ` Jiri Slaby
  0 siblings, 1 reply; 17+ messages in thread
From: David Miller @ 2017-04-24 18:24 UTC (permalink / raw)
  To: jslaby
  Cc: alexei.starovoitov, mingo, mingo, tglx, hpa, x86, jpoimboe,
	linux-kernel, netdev, daniel, edumazet

From: Jiri Slaby <jslaby@suse.cz>
Date: Mon, 24 Apr 2017 19:51:54 +0200

> For example what's the point of making the sk_load_word_positive_offset
> label a global, callable function? Note that this is exactly the reason
> why this particular two hunks look weird to you even though the
> annotations only mechanically paraphrase what is in the current code.

So that it can be referenced by the eBPF JIT, because these are
helpers for eBPF JIT generated code.  Every architecture implementing
an eBPF JIT has this "mess".

You can't even put a tracepoint or kprobe on these things and expect
to see "arguments" or "return PC" values in the usual spots.  This
code has special calling conventions and register usage as Alexei
explained.

I would suggest that you read and understand how this assembler is
designed, how it is called from the generated JIT code, and what it's
semantics and register usage are, before trying to annotating it.

Thank you.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 07/29] x86: bpf_jit, use ENTRY+ENDPROC
  2017-04-24 18:24                       ` David Miller
@ 2017-04-25 14:41                         ` Jiri Slaby
  0 siblings, 0 replies; 17+ messages in thread
From: Jiri Slaby @ 2017-04-25 14:41 UTC (permalink / raw)
  To: David Miller
  Cc: alexei.starovoitov, mingo, mingo, tglx, hpa, x86, jpoimboe,
	linux-kernel, netdev, daniel, edumazet

On 04/24/2017, 08:24 PM, David Miller wrote:
> From: Jiri Slaby <jslaby@suse.cz>
> Date: Mon, 24 Apr 2017 19:51:54 +0200
> 
>> For example what's the point of making the sk_load_word_positive_offset
>> label a global, callable function? Note that this is exactly the reason
>> why this particular two hunks look weird to you even though the
>> annotations only mechanically paraphrase what is in the current code.
> 
> So that it can be referenced by the eBPF JIT, because these are
> helpers for eBPF JIT generated code.  Every architecture implementing
> an eBPF JIT has this "mess".

I completely understand the needs for this, but I am complaining about
the way it is written. That is not the best -- unbalanced annotations, C
macros in lowercase (apart from that, C macros in .S need semicolons &
backslashes), FUNC macro, etc.

> You can't even put a tracepoint or kprobe on these things and expect
> to see "arguments" or "return PC" values in the usual spots.  This
> code has special calling conventions and register usage as Alexei
> explained.

Yes, I can see that.

> I would suggest that you read and understand how this assembler is
> designed, how it is called from the generated JIT code, and what it's
> semantics and register usage are, before trying to annotating it.

Of course I studied the code. I only missed macro CHOOSE_LOAD_FUNC which
I see now. So that answers why sk_load_word_positive_offset & similar
are marked as .globl.


But the original question I asked still remains: why do you mind calling
them BPF_FUNC_START & *_END, given:

1) the functions are marked by "FUNC" already:
$ git grep FUNC linus/master arch/x86/net/bpf_jit.S
linus/master:arch/x86/net/bpf_jit.S:#define FUNC(name) \
linus/master:arch/x86/net/bpf_jit.S:FUNC(sk_load_word)
linus/master:arch/x86/net/bpf_jit.S:FUNC(sk_load_word_positive_offset)
linus/master:arch/x86/net/bpf_jit.S:FUNC(sk_load_half)
linus/master:arch/x86/net/bpf_jit.S:FUNC(sk_load_half_positive_offset)
linus/master:arch/x86/net/bpf_jit.S:FUNC(sk_load_byte)
linus/master:arch/x86/net/bpf_jit.S:FUNC(sk_load_byte_positive_offset)
linus/master:arch/x86/net/bpf_jit.S:FUNC(sk_load_word_negative_offset)
linus/master:arch/x86/net/bpf_jit.S:FUNC(sk_load_half_negative_offset)
linus/master:arch/x86/net/bpf_jit.S:FUNC(sk_load_byte_negative_offset)

2) they _are_ all callable from within the JIT code:
EMIT1_off32(0xE8, jmp_offset);

Yes, I fucked up the ENDs. They should be on different locations. But
the pieces are still functions from my POV and should be annotated
accordingly.

thanks,
-- 
js
suse labs

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2017-04-25 14:41 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20170421141305.25180-1-jslaby@suse.cz>
2017-04-21 14:12 ` [PATCH v3 07/29] x86: bpf_jit, use ENTRY+ENDPROC Jiri Slaby
2017-04-21 19:32   ` Alexei Starovoitov
2017-04-24  6:45     ` Jiri Slaby
2017-04-24 14:41       ` David Miller
2017-04-24 14:52         ` Jiri Slaby
2017-04-24 15:08           ` David Miller
2017-04-24 15:41             ` Jiri Slaby
2017-04-24 15:51               ` David Miller
2017-04-24 15:53                 ` Jiri Slaby
2017-04-24 15:55               ` Ingo Molnar
2017-04-24 16:02                 ` Jiri Slaby
2017-04-24 16:40                   ` Ingo Molnar
2017-04-24 16:47                   ` Alexei Starovoitov
2017-04-24 17:51                     ` Jiri Slaby
2017-04-24 18:24                       ` David Miller
2017-04-25 14:41                         ` Jiri Slaby
2017-04-21 14:13 ` [PATCH v3 26/29] x86_64: assembly, change all ENTRY to SYM_FUNC_START Jiri Slaby

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).