* [PATCH 1/2] powerpc/bug: Remove specific powerpc BUG_ON() and WARN_ON() on PPC32
@ 2021-04-12 16:26 Christophe Leroy
2021-04-12 16:26 ` [PATCH 2/2] powerpc/bug: Provide better flexibility to WARN_ON/__WARN_FLAGS() with asm goto Christophe Leroy
2021-06-25 14:41 ` [PATCH 1/2] powerpc/bug: Remove specific powerpc BUG_ON() and WARN_ON() on PPC32 Christophe Leroy
0 siblings, 2 replies; 5+ messages in thread
From: Christophe Leroy @ 2021-04-12 16:26 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman
Cc: linuxppc-dev, linux-kernel
powerpc BUG_ON() and WARN_ON() are based on using twnei instruction.
For catching simple conditions like a variable having value 0, this
is efficient because it does the test and the trap at the same time.
But most conditions used with BUG_ON or WARN_ON are more complex and
forces GCC to format the condition into a 0 or 1 value in a register.
This will usually require 2 to 3 instructions.
The most efficient solution would be to use __builtin_trap() because
GCC is able to optimise the use of the different trap instructions
based on the requested condition, but this is complex if not
impossible for the following reasons:
- __builtin_trap() is a non-recoverable instruction, so it can't be
used for WARN_ON
- Knowing which line of code generated the trap would require the
analysis of DWARF information. This is not a feature we have today.
As mentioned in commit 8d4fbcfbe0a4 ("Fix WARN_ON() on bitfield ops")
the way WARN_ON() is implemented is suboptimal. That commit also
mentions an issue with 'long long' condition. It fixed it for
WARN_ON() but the same problem still exists today with BUG_ON() on
PPC32. It will be fixed by using the generic implementation.
By using the generic implementation, gcc will naturally generate a
branch to the unconditional trap generated by BUG().
As modern powerpc implement zero-cycle branch,
that's even more efficient.
And for the functions using WARN_ON() and its return, the test
on return from WARN_ON() is now also used for the WARN_ON() itself.
On PPC64 we don't want it because we want to be able to use CFAR
register to track how we entered the code that trapped. The CFAR
register would be clobbered by the branch.
A simple test function:
unsigned long test9w(unsigned long a, unsigned long b)
{
if (WARN_ON(!b))
return 0;
return a / b;
}
Before the patch:
0000046c <test9w>:
46c: 7c 89 00 34 cntlzw r9,r4
470: 55 29 d9 7e rlwinm r9,r9,27,5,31
474: 0f 09 00 00 twnei r9,0
478: 2c 04 00 00 cmpwi r4,0
47c: 41 82 00 0c beq 488 <test9w+0x1c>
480: 7c 63 23 96 divwu r3,r3,r4
484: 4e 80 00 20 blr
488: 38 60 00 00 li r3,0
48c: 4e 80 00 20 blr
After the patch:
00000468 <test9w>:
468: 2c 04 00 00 cmpwi r4,0
46c: 41 82 00 0c beq 478 <test9w+0x10>
470: 7c 63 23 96 divwu r3,r3,r4
474: 4e 80 00 20 blr
478: 0f e0 00 00 twui r0,0
47c: 38 60 00 00 li r3,0
480: 4e 80 00 20 blr
So we see before the patch we need 3 instructions on the likely path
to handle the WARN_ON(). With the patch the trap goes on the unlikely
path.
See below the difference at the entry of system_call_exception where
we have several BUG_ON(), allthough less impressing.
With the patch:
00000000 <system_call_exception>:
0: 81 6a 00 84 lwz r11,132(r10)
4: 90 6a 00 88 stw r3,136(r10)
8: 71 60 00 02 andi. r0,r11,2
c: 41 82 00 70 beq 7c <system_call_exception+0x7c>
10: 71 60 40 00 andi. r0,r11,16384
14: 41 82 00 6c beq 80 <system_call_exception+0x80>
18: 71 6b 80 00 andi. r11,r11,32768
1c: 41 82 00 68 beq 84 <system_call_exception+0x84>
20: 94 21 ff e0 stwu r1,-32(r1)
24: 93 e1 00 1c stw r31,28(r1)
28: 7d 8c 42 e6 mftb r12
...
7c: 0f e0 00 00 twui r0,0
80: 0f e0 00 00 twui r0,0
84: 0f e0 00 00 twui r0,0
Without the patch:
00000000 <system_call_exception>:
0: 94 21 ff e0 stwu r1,-32(r1)
4: 93 e1 00 1c stw r31,28(r1)
8: 90 6a 00 88 stw r3,136(r10)
c: 81 6a 00 84 lwz r11,132(r10)
10: 69 60 00 02 xori r0,r11,2
14: 54 00 ff fe rlwinm r0,r0,31,31,31
18: 0f 00 00 00 twnei r0,0
1c: 69 60 40 00 xori r0,r11,16384
20: 54 00 97 fe rlwinm r0,r0,18,31,31
24: 0f 00 00 00 twnei r0,0
28: 69 6b 80 00 xori r11,r11,32768
2c: 55 6b 8f fe rlwinm r11,r11,17,31,31
30: 0f 0b 00 00 twnei r11,0
34: 7d 8c 42 e6 mftb r12
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
---
arch/powerpc/include/asm/bug.h | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/include/asm/bug.h b/arch/powerpc/include/asm/bug.h
index d1635ffbb179..101dea4eec8d 100644
--- a/arch/powerpc/include/asm/bug.h
+++ b/arch/powerpc/include/asm/bug.h
@@ -68,7 +68,11 @@
BUG_ENTRY("twi 31, 0, 0", 0); \
unreachable(); \
} while (0)
+#define HAVE_ARCH_BUG
+
+#define __WARN_FLAGS(flags) BUG_ENTRY("twi 31, 0, 0", BUGFLAG_WARNING | (flags))
+#ifdef CONFIG_PPC64
#define BUG_ON(x) do { \
if (__builtin_constant_p(x)) { \
if (x) \
@@ -78,8 +82,6 @@
} \
} while (0)
-#define __WARN_FLAGS(flags) BUG_ENTRY("twi 31, 0, 0", BUGFLAG_WARNING | (flags))
-
#define WARN_ON(x) ({ \
int __ret_warn_on = !!(x); \
if (__builtin_constant_p(__ret_warn_on)) { \
@@ -93,9 +95,10 @@
unlikely(__ret_warn_on); \
})
-#define HAVE_ARCH_BUG
#define HAVE_ARCH_BUG_ON
#define HAVE_ARCH_WARN_ON
+#endif
+
#endif /* __ASSEMBLY __ */
#else
#ifdef __ASSEMBLY__
--
2.25.0
^ permalink raw reply related [flat|nested] 5+ messages in thread* [PATCH 2/2] powerpc/bug: Provide better flexibility to WARN_ON/__WARN_FLAGS() with asm goto 2021-04-12 16:26 [PATCH 1/2] powerpc/bug: Remove specific powerpc BUG_ON() and WARN_ON() on PPC32 Christophe Leroy @ 2021-04-12 16:26 ` Christophe Leroy 2021-04-12 20:40 ` kernel test robot 2021-06-25 14:41 ` [PATCH 1/2] powerpc/bug: Remove specific powerpc BUG_ON() and WARN_ON() on PPC32 Christophe Leroy 1 sibling, 1 reply; 5+ messages in thread From: Christophe Leroy @ 2021-04-12 16:26 UTC (permalink / raw) To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman Cc: linuxppc-dev, linux-kernel Using asm goto in __WARN_FLAGS() and WARN_ON() allows more flexibility to GCC. For that add an entry to the exception table so that program_check_exception() knowns where to resume execution after a WARNING. Here are two exemples. The first one is done on PPC32 (which benefits from the previous patch), the second is on PPC64. unsigned long test(struct pt_regs *regs) { int ret; WARN_ON(regs->msr & MSR_PR); return regs->gpr[3]; } unsigned long test9w(unsigned long a, unsigned long b) { if (WARN_ON(!b)) return 0; return a / b; } Before the patch: 000003a8 <test>: 3a8: 81 23 00 84 lwz r9,132(r3) 3ac: 71 29 40 00 andi. r9,r9,16384 3b0: 40 82 00 0c bne 3bc <test+0x14> 3b4: 80 63 00 0c lwz r3,12(r3) 3b8: 4e 80 00 20 blr 3bc: 0f e0 00 00 twui r0,0 3c0: 80 63 00 0c lwz r3,12(r3) 3c4: 4e 80 00 20 blr 0000000000000bf0 <.test9w>: bf0: 7c 89 00 74 cntlzd r9,r4 bf4: 79 29 d1 82 rldicl r9,r9,58,6 bf8: 0b 09 00 00 tdnei r9,0 bfc: 2c 24 00 00 cmpdi r4,0 c00: 41 82 00 0c beq c0c <.test9w+0x1c> c04: 7c 63 23 92 divdu r3,r3,r4 c08: 4e 80 00 20 blr c0c: 38 60 00 00 li r3,0 c10: 4e 80 00 20 blr After the patch: 000003a8 <test>: 3a8: 81 23 00 84 lwz r9,132(r3) 3ac: 71 29 40 00 andi. r9,r9,16384 3b0: 40 82 00 0c bne 3bc <test+0x14> 3b4: 80 63 00 0c lwz r3,12(r3) 3b8: 4e 80 00 20 blr 3bc: 0f e0 00 00 twui r0,0 0000000000000c50 <.test9w>: c50: 7c 89 00 74 cntlzd r9,r4 c54: 79 29 d1 82 rldicl r9,r9,58,6 c58: 0b 09 00 00 tdnei r9,0 c5c: 7c 63 23 92 divdu r3,r3,r4 c60: 4e 80 00 20 blr c70: 38 60 00 00 li r3,0 c74: 4e 80 00 20 blr In the first exemple, we see GCC doesn't need to duplicate what happens after the trap. In the second exemple, we see that GCC doesn't need to emit a test and a branch in the likely path in addition to the trap. We've got some WARN_ON() in .softirqentry.text section so it needs to be added in the OTHER_TEXT_SECTIONS in modpost.c Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> --- arch/powerpc/include/asm/book3s/64/kup.h | 2 +- arch/powerpc/include/asm/bug.h | 51 +++++++++++++++++++----- arch/powerpc/include/asm/extable.h | 14 +++++++ arch/powerpc/include/asm/ppc_asm.h | 11 +---- arch/powerpc/kernel/entry_64.S | 2 +- arch/powerpc/kernel/exceptions-64e.S | 2 +- arch/powerpc/kernel/misc_32.S | 2 +- arch/powerpc/kernel/traps.c | 9 ++++- scripts/mod/modpost.c | 2 +- 9 files changed, 69 insertions(+), 26 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/64/kup.h b/arch/powerpc/include/asm/book3s/64/kup.h index 9700da3a4093..a22839cba32e 100644 --- a/arch/powerpc/include/asm/book3s/64/kup.h +++ b/arch/powerpc/include/asm/book3s/64/kup.h @@ -90,7 +90,7 @@ /* Prevent access to userspace using any key values */ LOAD_REG_IMMEDIATE(\gpr2, AMR_KUAP_BLOCKED) 999: tdne \gpr1, \gpr2 - EMIT_BUG_ENTRY 999b, __FILE__, __LINE__, (BUGFLAG_WARNING | BUGFLAG_ONCE) + EMIT_WARN_ENTRY 999b, __FILE__, __LINE__, (BUGFLAG_WARNING | BUGFLAG_ONCE) END_MMU_FTR_SECTION_NESTED_IFSET(MMU_FTR_BOOK3S_KUAP, 67) #endif .endm diff --git a/arch/powerpc/include/asm/bug.h b/arch/powerpc/include/asm/bug.h index 101dea4eec8d..d92afdbd4449 100644 --- a/arch/powerpc/include/asm/bug.h +++ b/arch/powerpc/include/asm/bug.h @@ -4,6 +4,7 @@ #ifdef __KERNEL__ #include <asm/asm-compat.h> +#include <asm/extable.h> #ifdef CONFIG_BUG @@ -30,6 +31,11 @@ .endm #endif /* verbose */ +.macro EMIT_WARN_ENTRY addr,file,line,flags + EX_TABLE(\addr,\addr+4) + EMIT_BUG_ENTRY \addr,\file,\line,\flags +.endm + #else /* !__ASSEMBLY__ */ /* _EMIT_BUG_ENTRY expects args %0,%1,%2,%3 to be FILE, LINE, flags and sizeof(struct bug_entry), respectively */ @@ -58,6 +64,16 @@ "i" (sizeof(struct bug_entry)), \ ##__VA_ARGS__) +#define WARN_ENTRY(insn, flags, label, ...) \ + asm_volatile_goto( \ + "1: " insn "\n" \ + EX_TABLE(1b, %l[label]) \ + _EMIT_BUG_ENTRY \ + : : "i" (__FILE__), "i" (__LINE__), \ + "i" (flags), \ + "i" (sizeof(struct bug_entry)), \ + ##__VA_ARGS__ : : label) + /* * BUG_ON() and WARN_ON() do their best to cooperate with compile-time * optimisations. However depending on the complexity of the condition @@ -70,7 +86,15 @@ } while (0) #define HAVE_ARCH_BUG -#define __WARN_FLAGS(flags) BUG_ENTRY("twi 31, 0, 0", BUGFLAG_WARNING | (flags)) +#define __WARN_FLAGS(flags) do { \ + __label__ __label_warn_on; \ + \ + WARN_ENTRY("twi 31, 0, 0", BUGFLAG_WARNING | (flags), __label_warn_on); \ + unreachable(); \ + \ +__label_warn_on: \ + break; \ +} while (0) #ifdef CONFIG_PPC64 #define BUG_ON(x) do { \ @@ -83,15 +107,24 @@ } while (0) #define WARN_ON(x) ({ \ - int __ret_warn_on = !!(x); \ - if (__builtin_constant_p(__ret_warn_on)) { \ - if (__ret_warn_on) \ + bool __ret_warn_on = false; \ + do { \ + if (__builtin_constant_p((x))) { \ + if (!(x)) \ + break; \ __WARN(); \ - } else { \ - BUG_ENTRY(PPC_TLNEI " %4, 0", \ - BUGFLAG_WARNING | BUGFLAG_TAINT(TAINT_WARN), \ - "r" (__ret_warn_on)); \ - } \ + __ret_warn_on = true; \ + } else { \ + __label__ __label_warn_on; \ + \ + WARN_ENTRY(PPC_TLNEI " %4, 0", \ + BUGFLAG_WARNING | BUGFLAG_TAINT(TAINT_WARN), \ + __label_warn_on, "r" (x)); \ + break; \ +__label_warn_on: \ + __ret_warn_on = true; \ + } \ + } while (0); \ unlikely(__ret_warn_on); \ }) diff --git a/arch/powerpc/include/asm/extable.h b/arch/powerpc/include/asm/extable.h index eb91b2d2935a..26ce2e5c0fa8 100644 --- a/arch/powerpc/include/asm/extable.h +++ b/arch/powerpc/include/asm/extable.h @@ -17,6 +17,8 @@ #define ARCH_HAS_RELATIVE_EXTABLE +#ifndef __ASSEMBLY__ + struct exception_table_entry { int insn; int fixup; @@ -28,3 +30,15 @@ static inline unsigned long extable_fixup(const struct exception_table_entry *x) } #endif + +/* + * Helper macro for exception table entries + */ +#define EX_TABLE(_fault, _target) \ + stringify_in_c(.section __ex_table,"a";)\ + stringify_in_c(.balign 4;) \ + stringify_in_c(.long (_fault) - . ;) \ + stringify_in_c(.long (_target) - . ;) \ + stringify_in_c(.previous) + +#endif diff --git a/arch/powerpc/include/asm/ppc_asm.h b/arch/powerpc/include/asm/ppc_asm.h index 8998122fc7e2..a74e1561535a 100644 --- a/arch/powerpc/include/asm/ppc_asm.h +++ b/arch/powerpc/include/asm/ppc_asm.h @@ -10,6 +10,7 @@ #include <asm/ppc-opcode.h> #include <asm/firmware.h> #include <asm/feature-fixups.h> +#include <asm/extable.h> #ifdef __ASSEMBLY__ @@ -772,16 +773,6 @@ END_FTR_SECTION_NESTED(CPU_FTR_CELL_TB_BUG, CPU_FTR_CELL_TB_BUG, 96) #endif /* __ASSEMBLY__ */ -/* - * Helper macro for exception table entries - */ -#define EX_TABLE(_fault, _target) \ - stringify_in_c(.section __ex_table,"a";)\ - stringify_in_c(.balign 4;) \ - stringify_in_c(.long (_fault) - . ;) \ - stringify_in_c(.long (_target) - . ;) \ - stringify_in_c(.previous) - #ifdef CONFIG_PPC_FSL_BOOK3E #define BTB_FLUSH(reg) \ lis reg,BUCSR_INIT@h; \ diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S index 6c4d9e276c4d..faa64cc90f02 100644 --- a/arch/powerpc/kernel/entry_64.S +++ b/arch/powerpc/kernel/entry_64.S @@ -835,7 +835,7 @@ _GLOBAL(enter_rtas) */ lbz r0,PACAIRQSOFTMASK(r13) 1: tdeqi r0,IRQS_ENABLED - EMIT_BUG_ENTRY 1b,__FILE__,__LINE__,BUGFLAG_WARNING + EMIT_WARN_ENTRY 1b,__FILE__,__LINE__,BUGFLAG_WARNING #endif /* Hard-disable interrupts */ diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S index e8eb9992a270..3f8c51390a04 100644 --- a/arch/powerpc/kernel/exceptions-64e.S +++ b/arch/powerpc/kernel/exceptions-64e.S @@ -1249,7 +1249,7 @@ fast_exception_return: /* The interrupt should not have soft enabled. */ lbz r7,PACAIRQSOFTMASK(r13) 1: tdeqi r7,IRQS_ENABLED - EMIT_BUG_ENTRY 1b,__FILE__,__LINE__,BUGFLAG_WARNING + EMIT_WARN_ENTRY 1b,__FILE__,__LINE__,BUGFLAG_WARNING #endif b fast_exception_return diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S index 6a076bef2932..21390f3119a9 100644 --- a/arch/powerpc/kernel/misc_32.S +++ b/arch/powerpc/kernel/misc_32.S @@ -237,7 +237,7 @@ _GLOBAL(copy_page) addi r3,r3,-4 0: twnei r5, 0 /* WARN if r3 is not cache aligned */ - EMIT_BUG_ENTRY 0b,__FILE__,__LINE__, BUGFLAG_WARNING + EMIT_WARN_ENTRY 0b,__FILE__,__LINE__, BUGFLAG_WARNING addi r4,r4,-4 diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c index efba99870691..ee40a637313d 100644 --- a/arch/powerpc/kernel/traps.c +++ b/arch/powerpc/kernel/traps.c @@ -1467,8 +1467,13 @@ static void do_program_check(struct pt_regs *regs) if (!(regs->msr & MSR_PR) && /* not user-mode */ report_bug(bugaddr, regs) == BUG_TRAP_TYPE_WARN) { - regs->nip += 4; - return; + const struct exception_table_entry *entry; + + entry = search_exception_tables(bugaddr); + if (entry) { + regs->nip = extable_fixup(entry) + regs->nip - bugaddr; + return; + } } _exception(SIGTRAP, regs, TRAP_BRKPT, regs->nip); return; diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c index 24725e50c7b4..34745f239208 100644 --- a/scripts/mod/modpost.c +++ b/scripts/mod/modpost.c @@ -926,7 +926,7 @@ static void check_section(const char *modname, struct elf_info *elf, ".kprobes.text", ".cpuidle.text", ".noinstr.text" #define OTHER_TEXT_SECTIONS ".ref.text", ".head.text", ".spinlock.text", \ ".fixup", ".entry.text", ".exception.text", ".text.*", \ - ".coldtext" + ".coldtext .softirqentry.text" #define INIT_SECTIONS ".init.*" #define MEM_INIT_SECTIONS ".meminit.*" -- 2.25.0 ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH 2/2] powerpc/bug: Provide better flexibility to WARN_ON/__WARN_FLAGS() with asm goto 2021-04-12 16:26 ` [PATCH 2/2] powerpc/bug: Provide better flexibility to WARN_ON/__WARN_FLAGS() with asm goto Christophe Leroy @ 2021-04-12 20:40 ` kernel test robot 0 siblings, 0 replies; 5+ messages in thread From: kernel test robot @ 2021-04-12 20:40 UTC (permalink / raw) To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman Cc: clang-built-linux, kbuild-all, linuxppc-dev, linux-kernel [-- Attachment #1: Type: text/plain, Size: 2874 bytes --] Hi Christophe, I love your patch! Yet something to improve: [auto build test ERROR on powerpc/next] [also build test ERROR on kbuild/for-next v5.12-rc7 next-20210412] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Christophe-Leroy/powerpc-bug-Remove-specific-powerpc-BUG_ON-and-WARN_ON-on-PPC32/20210413-002741 base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next config: powerpc-randconfig-r035-20210412 (attached as .config) compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 9829f5e6b1bca9b61efc629770d28bb9014dec45) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # install powerpc cross compiling tool for clang build # apt-get install binutils-powerpc-linux-gnu # https://github.com/0day-ci/linux/commit/398ea05a716b58d231d62d20083a101488d155b4 git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Christophe-Leroy/powerpc-bug-Remove-specific-powerpc-BUG_ON-and-WARN_ON-on-PPC32/20210413-002741 git checkout 398ea05a716b58d231d62d20083a101488d155b4 # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All errors (new ones prefixed by >>): arch/powerpc/kernel/misc_32.S: Assembler messages: >> arch/powerpc/kernel/misc_32.S:240: Error: unrecognized opcode: `emit_warn_entry' clang-13: error: assembler command failed with exit code 1 (use -v to see invocation) vim +240 arch/powerpc/kernel/misc_32.S 218 219 /* 220 * Copy a whole page. We use the dcbz instruction on the destination 221 * to reduce memory traffic (it eliminates the unnecessary reads of 222 * the destination into cache). This requires that the destination 223 * is cacheable. 224 */ 225 #define COPY_16_BYTES \ 226 lwz r6,4(r4); \ 227 lwz r7,8(r4); \ 228 lwz r8,12(r4); \ 229 lwzu r9,16(r4); \ 230 stw r6,4(r3); \ 231 stw r7,8(r3); \ 232 stw r8,12(r3); \ 233 stwu r9,16(r3) 234 235 _GLOBAL(copy_page) 236 rlwinm r5, r3, 0, L1_CACHE_BYTES - 1 237 addi r3,r3,-4 238 239 0: twnei r5, 0 /* WARN if r3 is not cache aligned */ > 240 EMIT_WARN_ENTRY 0b,__FILE__,__LINE__, BUGFLAG_WARNING 241 242 addi r4,r4,-4 243 244 li r5,4 245 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org [-- Attachment #2: .config.gz --] [-- Type: application/gzip, Size: 27424 bytes --] ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 1/2] powerpc/bug: Remove specific powerpc BUG_ON() and WARN_ON() on PPC32 2021-04-12 16:26 [PATCH 1/2] powerpc/bug: Remove specific powerpc BUG_ON() and WARN_ON() on PPC32 Christophe Leroy 2021-04-12 16:26 ` [PATCH 2/2] powerpc/bug: Provide better flexibility to WARN_ON/__WARN_FLAGS() with asm goto Christophe Leroy @ 2021-06-25 14:41 ` Christophe Leroy 2021-08-04 5:25 ` Christophe Leroy 1 sibling, 1 reply; 5+ messages in thread From: Christophe Leroy @ 2021-06-25 14:41 UTC (permalink / raw) To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman Cc: linuxppc-dev, linux-kernel Hi Michael, What happened to this series ? It has been flagged 'under review' in Patchwork since mid April but I never saw it in next-test. Thanks Christophe Le 12/04/2021 à 18:26, Christophe Leroy a écrit : > powerpc BUG_ON() and WARN_ON() are based on using twnei instruction. > > For catching simple conditions like a variable having value 0, this > is efficient because it does the test and the trap at the same time. > But most conditions used with BUG_ON or WARN_ON are more complex and > forces GCC to format the condition into a 0 or 1 value in a register. > This will usually require 2 to 3 instructions. > > The most efficient solution would be to use __builtin_trap() because > GCC is able to optimise the use of the different trap instructions > based on the requested condition, but this is complex if not > impossible for the following reasons: > - __builtin_trap() is a non-recoverable instruction, so it can't be > used for WARN_ON > - Knowing which line of code generated the trap would require the > analysis of DWARF information. This is not a feature we have today. > > As mentioned in commit 8d4fbcfbe0a4 ("Fix WARN_ON() on bitfield ops") > the way WARN_ON() is implemented is suboptimal. That commit also > mentions an issue with 'long long' condition. It fixed it for > WARN_ON() but the same problem still exists today with BUG_ON() on > PPC32. It will be fixed by using the generic implementation. > > By using the generic implementation, gcc will naturally generate a > branch to the unconditional trap generated by BUG(). > > As modern powerpc implement zero-cycle branch, > that's even more efficient. > > And for the functions using WARN_ON() and its return, the test > on return from WARN_ON() is now also used for the WARN_ON() itself. > > On PPC64 we don't want it because we want to be able to use CFAR > register to track how we entered the code that trapped. The CFAR > register would be clobbered by the branch. > > A simple test function: > > unsigned long test9w(unsigned long a, unsigned long b) > { > if (WARN_ON(!b)) > return 0; > return a / b; > } > > Before the patch: > > 0000046c <test9w>: > 46c: 7c 89 00 34 cntlzw r9,r4 > 470: 55 29 d9 7e rlwinm r9,r9,27,5,31 > 474: 0f 09 00 00 twnei r9,0 > 478: 2c 04 00 00 cmpwi r4,0 > 47c: 41 82 00 0c beq 488 <test9w+0x1c> > 480: 7c 63 23 96 divwu r3,r3,r4 > 484: 4e 80 00 20 blr > > 488: 38 60 00 00 li r3,0 > 48c: 4e 80 00 20 blr > > After the patch: > > 00000468 <test9w>: > 468: 2c 04 00 00 cmpwi r4,0 > 46c: 41 82 00 0c beq 478 <test9w+0x10> > 470: 7c 63 23 96 divwu r3,r3,r4 > 474: 4e 80 00 20 blr > > 478: 0f e0 00 00 twui r0,0 > 47c: 38 60 00 00 li r3,0 > 480: 4e 80 00 20 blr > > So we see before the patch we need 3 instructions on the likely path > to handle the WARN_ON(). With the patch the trap goes on the unlikely > path. > > See below the difference at the entry of system_call_exception where > we have several BUG_ON(), allthough less impressing. > > With the patch: > > 00000000 <system_call_exception>: > 0: 81 6a 00 84 lwz r11,132(r10) > 4: 90 6a 00 88 stw r3,136(r10) > 8: 71 60 00 02 andi. r0,r11,2 > c: 41 82 00 70 beq 7c <system_call_exception+0x7c> > 10: 71 60 40 00 andi. r0,r11,16384 > 14: 41 82 00 6c beq 80 <system_call_exception+0x80> > 18: 71 6b 80 00 andi. r11,r11,32768 > 1c: 41 82 00 68 beq 84 <system_call_exception+0x84> > 20: 94 21 ff e0 stwu r1,-32(r1) > 24: 93 e1 00 1c stw r31,28(r1) > 28: 7d 8c 42 e6 mftb r12 > ... > 7c: 0f e0 00 00 twui r0,0 > 80: 0f e0 00 00 twui r0,0 > 84: 0f e0 00 00 twui r0,0 > > Without the patch: > > 00000000 <system_call_exception>: > 0: 94 21 ff e0 stwu r1,-32(r1) > 4: 93 e1 00 1c stw r31,28(r1) > 8: 90 6a 00 88 stw r3,136(r10) > c: 81 6a 00 84 lwz r11,132(r10) > 10: 69 60 00 02 xori r0,r11,2 > 14: 54 00 ff fe rlwinm r0,r0,31,31,31 > 18: 0f 00 00 00 twnei r0,0 > 1c: 69 60 40 00 xori r0,r11,16384 > 20: 54 00 97 fe rlwinm r0,r0,18,31,31 > 24: 0f 00 00 00 twnei r0,0 > 28: 69 6b 80 00 xori r11,r11,32768 > 2c: 55 6b 8f fe rlwinm r11,r11,17,31,31 > 30: 0f 0b 00 00 twnei r11,0 > 34: 7d 8c 42 e6 mftb r12 > > Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> > --- > arch/powerpc/include/asm/bug.h | 9 ++++++--- > 1 file changed, 6 insertions(+), 3 deletions(-) > > diff --git a/arch/powerpc/include/asm/bug.h b/arch/powerpc/include/asm/bug.h > index d1635ffbb179..101dea4eec8d 100644 > --- a/arch/powerpc/include/asm/bug.h > +++ b/arch/powerpc/include/asm/bug.h > @@ -68,7 +68,11 @@ > BUG_ENTRY("twi 31, 0, 0", 0); \ > unreachable(); \ > } while (0) > +#define HAVE_ARCH_BUG > + > +#define __WARN_FLAGS(flags) BUG_ENTRY("twi 31, 0, 0", BUGFLAG_WARNING | (flags)) > > +#ifdef CONFIG_PPC64 > #define BUG_ON(x) do { \ > if (__builtin_constant_p(x)) { \ > if (x) \ > @@ -78,8 +82,6 @@ > } \ > } while (0) > > -#define __WARN_FLAGS(flags) BUG_ENTRY("twi 31, 0, 0", BUGFLAG_WARNING | (flags)) > - > #define WARN_ON(x) ({ \ > int __ret_warn_on = !!(x); \ > if (__builtin_constant_p(__ret_warn_on)) { \ > @@ -93,9 +95,10 @@ > unlikely(__ret_warn_on); \ > }) > > -#define HAVE_ARCH_BUG > #define HAVE_ARCH_BUG_ON > #define HAVE_ARCH_WARN_ON > +#endif > + > #endif /* __ASSEMBLY __ */ > #else > #ifdef __ASSEMBLY__ > ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 1/2] powerpc/bug: Remove specific powerpc BUG_ON() and WARN_ON() on PPC32 2021-06-25 14:41 ` [PATCH 1/2] powerpc/bug: Remove specific powerpc BUG_ON() and WARN_ON() on PPC32 Christophe Leroy @ 2021-08-04 5:25 ` Christophe Leroy 0 siblings, 0 replies; 5+ messages in thread From: Christophe Leroy @ 2021-08-04 5:25 UTC (permalink / raw) To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman Cc: linuxppc-dev, linux-kernel Gentle ping Michael ? Le 25/06/2021 à 16:41, Christophe Leroy a écrit : > Hi Michael, > > What happened to this series ? It has been flagged 'under review' in Patchwork since mid April but I > never saw it in next-test. > > Thanks > Christophe > > Le 12/04/2021 à 18:26, Christophe Leroy a écrit : >> powerpc BUG_ON() and WARN_ON() are based on using twnei instruction. >> >> For catching simple conditions like a variable having value 0, this >> is efficient because it does the test and the trap at the same time. >> But most conditions used with BUG_ON or WARN_ON are more complex and >> forces GCC to format the condition into a 0 or 1 value in a register. >> This will usually require 2 to 3 instructions. >> >> The most efficient solution would be to use __builtin_trap() because >> GCC is able to optimise the use of the different trap instructions >> based on the requested condition, but this is complex if not >> impossible for the following reasons: >> - __builtin_trap() is a non-recoverable instruction, so it can't be >> used for WARN_ON >> - Knowing which line of code generated the trap would require the >> analysis of DWARF information. This is not a feature we have today. >> >> As mentioned in commit 8d4fbcfbe0a4 ("Fix WARN_ON() on bitfield ops") >> the way WARN_ON() is implemented is suboptimal. That commit also >> mentions an issue with 'long long' condition. It fixed it for >> WARN_ON() but the same problem still exists today with BUG_ON() on >> PPC32. It will be fixed by using the generic implementation. >> >> By using the generic implementation, gcc will naturally generate a >> branch to the unconditional trap generated by BUG(). >> >> As modern powerpc implement zero-cycle branch, >> that's even more efficient. >> >> And for the functions using WARN_ON() and its return, the test >> on return from WARN_ON() is now also used for the WARN_ON() itself. >> >> On PPC64 we don't want it because we want to be able to use CFAR >> register to track how we entered the code that trapped. The CFAR >> register would be clobbered by the branch. >> >> A simple test function: >> >> unsigned long test9w(unsigned long a, unsigned long b) >> { >> if (WARN_ON(!b)) >> return 0; >> return a / b; >> } >> >> Before the patch: >> >> 0000046c <test9w>: >> 46c: 7c 89 00 34 cntlzw r9,r4 >> 470: 55 29 d9 7e rlwinm r9,r9,27,5,31 >> 474: 0f 09 00 00 twnei r9,0 >> 478: 2c 04 00 00 cmpwi r4,0 >> 47c: 41 82 00 0c beq 488 <test9w+0x1c> >> 480: 7c 63 23 96 divwu r3,r3,r4 >> 484: 4e 80 00 20 blr >> >> 488: 38 60 00 00 li r3,0 >> 48c: 4e 80 00 20 blr >> >> After the patch: >> >> 00000468 <test9w>: >> 468: 2c 04 00 00 cmpwi r4,0 >> 46c: 41 82 00 0c beq 478 <test9w+0x10> >> 470: 7c 63 23 96 divwu r3,r3,r4 >> 474: 4e 80 00 20 blr >> >> 478: 0f e0 00 00 twui r0,0 >> 47c: 38 60 00 00 li r3,0 >> 480: 4e 80 00 20 blr >> >> So we see before the patch we need 3 instructions on the likely path >> to handle the WARN_ON(). With the patch the trap goes on the unlikely >> path. >> >> See below the difference at the entry of system_call_exception where >> we have several BUG_ON(), allthough less impressing. >> >> With the patch: >> >> 00000000 <system_call_exception>: >> 0: 81 6a 00 84 lwz r11,132(r10) >> 4: 90 6a 00 88 stw r3,136(r10) >> 8: 71 60 00 02 andi. r0,r11,2 >> c: 41 82 00 70 beq 7c <system_call_exception+0x7c> >> 10: 71 60 40 00 andi. r0,r11,16384 >> 14: 41 82 00 6c beq 80 <system_call_exception+0x80> >> 18: 71 6b 80 00 andi. r11,r11,32768 >> 1c: 41 82 00 68 beq 84 <system_call_exception+0x84> >> 20: 94 21 ff e0 stwu r1,-32(r1) >> 24: 93 e1 00 1c stw r31,28(r1) >> 28: 7d 8c 42 e6 mftb r12 >> ... >> 7c: 0f e0 00 00 twui r0,0 >> 80: 0f e0 00 00 twui r0,0 >> 84: 0f e0 00 00 twui r0,0 >> >> Without the patch: >> >> 00000000 <system_call_exception>: >> 0: 94 21 ff e0 stwu r1,-32(r1) >> 4: 93 e1 00 1c stw r31,28(r1) >> 8: 90 6a 00 88 stw r3,136(r10) >> c: 81 6a 00 84 lwz r11,132(r10) >> 10: 69 60 00 02 xori r0,r11,2 >> 14: 54 00 ff fe rlwinm r0,r0,31,31,31 >> 18: 0f 00 00 00 twnei r0,0 >> 1c: 69 60 40 00 xori r0,r11,16384 >> 20: 54 00 97 fe rlwinm r0,r0,18,31,31 >> 24: 0f 00 00 00 twnei r0,0 >> 28: 69 6b 80 00 xori r11,r11,32768 >> 2c: 55 6b 8f fe rlwinm r11,r11,17,31,31 >> 30: 0f 0b 00 00 twnei r11,0 >> 34: 7d 8c 42 e6 mftb r12 >> >> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> >> --- >> arch/powerpc/include/asm/bug.h | 9 ++++++--- >> 1 file changed, 6 insertions(+), 3 deletions(-) >> >> diff --git a/arch/powerpc/include/asm/bug.h b/arch/powerpc/include/asm/bug.h >> index d1635ffbb179..101dea4eec8d 100644 >> --- a/arch/powerpc/include/asm/bug.h >> +++ b/arch/powerpc/include/asm/bug.h >> @@ -68,7 +68,11 @@ >> BUG_ENTRY("twi 31, 0, 0", 0); \ >> unreachable(); \ >> } while (0) >> +#define HAVE_ARCH_BUG >> + >> +#define __WARN_FLAGS(flags) BUG_ENTRY("twi 31, 0, 0", BUGFLAG_WARNING | (flags)) >> +#ifdef CONFIG_PPC64 >> #define BUG_ON(x) do { \ >> if (__builtin_constant_p(x)) { \ >> if (x) \ >> @@ -78,8 +82,6 @@ >> } \ >> } while (0) >> -#define __WARN_FLAGS(flags) BUG_ENTRY("twi 31, 0, 0", BUGFLAG_WARNING | (flags)) >> - >> #define WARN_ON(x) ({ \ >> int __ret_warn_on = !!(x); \ >> if (__builtin_constant_p(__ret_warn_on)) { \ >> @@ -93,9 +95,10 @@ >> unlikely(__ret_warn_on); \ >> }) >> -#define HAVE_ARCH_BUG >> #define HAVE_ARCH_BUG_ON >> #define HAVE_ARCH_WARN_ON >> +#endif >> + >> #endif /* __ASSEMBLY __ */ >> #else >> #ifdef __ASSEMBLY__ >> ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2021-08-04 5:25 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2021-04-12 16:26 [PATCH 1/2] powerpc/bug: Remove specific powerpc BUG_ON() and WARN_ON() on PPC32 Christophe Leroy 2021-04-12 16:26 ` [PATCH 2/2] powerpc/bug: Provide better flexibility to WARN_ON/__WARN_FLAGS() with asm goto Christophe Leroy 2021-04-12 20:40 ` kernel test robot 2021-06-25 14:41 ` [PATCH 1/2] powerpc/bug: Remove specific powerpc BUG_ON() and WARN_ON() on PPC32 Christophe Leroy 2021-08-04 5:25 ` Christophe Leroy
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).