* [PATCH 00/10] implement DYNAMIC_DEBUG_RELATIVE_POINTERS
@ 2019-04-09 21:25 Rasmus Villemoes
  2019-04-09 21:25 ` [PATCH 07/10] dynamic_debug: add asm-generic implementation for DYNAMIC_DEBUG_RELATIVE_POINTERS Rasmus Villemoes
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Rasmus Villemoes @ 2019-04-09 21:25 UTC (permalink / raw)
  To: Andrew Morton
  Cc: x86, Rasmus Villemoes, linux-kernel, Will Deacon, Jason Baron,
	Ingo Molnar, linuxppc-dev, linux-arm-kernel
Similar to CONFIG_GENERIC_BUG_RELATIVE_POINTERS that replaces (8 byte)
const char* members by (4 byte) signed offsets from the bug_entry,
this implements the similar thing for struct _ddebug, the descriptors
underlying pr_debug() and friends in a CONFIG_DYNAMIC_DEBUG kernel.
Since struct _ddebug has four string members, we save 16 byte per
instance. The total savings seem to be comparable to what is saved by
an architecture selecting GENERIC_BUG_RELATIVE_POINTERS (see patch 8
for some numbers for a common distro config).
While refreshing these patches, which were orignally just targeted at
x86-64, it occured to me that despite the implementation relying on
inline asm, there's nothing x86 specific about it, and indeed it seems
to work out-of-the-box for ppc64 and arm64 as well, but those have
only been compile-tested.
The first 6 patches are rather pedestrian preparations. The fun stuff
is in patch 7, and the remaining three patches are just the minimal
boilerplate to hook up the config option and asm-generic header on the
three architectures.
Rasmus Villemoes (10):
  linux/device.h: use unique identifier for each struct _ddebug
  linux/net.h: use unique identifier for each struct _ddebug
  linux/printk.h: use unique identifier for each struct _ddebug
  dynamic_debug: introduce accessors for string members of struct
    _ddebug
  dynamic_debug: drop use of bitfields in struct _ddebug
  dynamic_debug: introduce CONFIG_DYNAMIC_DEBUG_RELATIVE_POINTERS
  dynamic_debug: add asm-generic implementation for
    DYNAMIC_DEBUG_RELATIVE_POINTERS
  x86-64: select DYNAMIC_DEBUG_RELATIVE_POINTERS
  arm64: select DYNAMIC_DEBUG_RELATIVE_POINTERS
  powerpc: select DYNAMIC_DEBUG_RELATIVE_POINTERS for PPC64
 arch/arm64/Kconfig                          |   1 +
 arch/arm64/include/asm/Kbuild               |   1 +
 arch/powerpc/Kconfig                        |   1 +
 arch/powerpc/include/asm/Kbuild             |   1 +
 arch/x86/Kconfig                            |   1 +
 arch/x86/entry/vdso/vdso32/vclock_gettime.c |   1 +
 arch/x86/include/asm/Kbuild                 |   1 +
 include/asm-generic/dynamic_debug.h         | 116 ++++++++++++++++++++
 include/linux/device.h                      |   4 +-
 include/linux/dynamic_debug.h               |  26 +++--
 include/linux/jump_label.h                  |   2 +
 include/linux/net.h                         |   4 +-
 include/linux/printk.h                      |   4 +-
 lib/Kconfig.debug                           |   3 +
 lib/dynamic_debug.c                         | 111 ++++++++++++++-----
 15 files changed, 237 insertions(+), 40 deletions(-)
 create mode 100644 include/asm-generic/dynamic_debug.h
-- 
2.20.1
^ permalink raw reply	[flat|nested] 11+ messages in thread
* [PATCH 07/10] dynamic_debug: add asm-generic implementation for DYNAMIC_DEBUG_RELATIVE_POINTERS
  2019-04-09 21:25 [PATCH 00/10] implement DYNAMIC_DEBUG_RELATIVE_POINTERS Rasmus Villemoes
@ 2019-04-09 21:25 ` Rasmus Villemoes
  2019-04-09 21:25 ` [PATCH 10/10] powerpc: select DYNAMIC_DEBUG_RELATIVE_POINTERS for PPC64 Rasmus Villemoes
  2019-05-06  6:48 ` [PATCH 00/10] implement DYNAMIC_DEBUG_RELATIVE_POINTERS Rasmus Villemoes
  2 siblings, 0 replies; 11+ messages in thread
From: Rasmus Villemoes @ 2019-04-09 21:25 UTC (permalink / raw)
  To: Andrew Morton
  Cc: x86, Rasmus Villemoes, linux-kernel, Will Deacon, Jason Baron,
	Ingo Molnar, linuxppc-dev, linux-arm-kernel
A 64 bit architecture can reduce the size of the kernel image by
selecting CONFIG_DYNAMIC_DEBUG_RELATIVE_POINTERS, but it must provide
a proper DEFINE_DYNAMIC_DEBUG_METADATA macro for emitting the struct
_ddebug in assembly. However, since that does not involve any
instructions, this generic implementation should be usable by most if
not all.
It relies on
(1) standard assembly directives that should work on
all architectures
(2) the "i" constraint for an constant, and
(3) %cN emitting the constant operand N without punctuation
and of course the layout of _ddebug being what one expects. I've
succesfully built x86-64, arm64 and ppc64 defconfig +
CONFIG_DYNAMIC_DEBUG + patches to hook up this header, and boot-tested
the x86-64 one.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
---
 include/asm-generic/dynamic_debug.h | 116 ++++++++++++++++++++++++++++
 include/linux/jump_label.h          |   2 +
 2 files changed, 118 insertions(+)
 create mode 100644 include/asm-generic/dynamic_debug.h
diff --git a/include/asm-generic/dynamic_debug.h b/include/asm-generic/dynamic_debug.h
new file mode 100644
index 000000000000..f1dd3019cd98
--- /dev/null
+++ b/include/asm-generic/dynamic_debug.h
@@ -0,0 +1,116 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_GENERIC_DYNAMIC_DEBUG_H
+#define _ASM_GENERIC_DYNAMIC_DEBUG_H
+
+#ifndef _DYNAMIC_DEBUG_H
+#error "don't include asm/dynamic_debug.h directly"
+#endif
+
+#include <linux/build_bug.h>
+#ifdef CONFIG_JUMP_LABEL
+#include <linux/jump_label.h>
+#endif
+
+/*
+ * We need to know the exact layout of struct _ddebug in order to
+ * initialize it in assembly. Check that all members are at expected
+ * offsets - if any of these fail, the arch cannot use this generic
+ * dynamic_debug.h. DYNAMIC_DEBUG_RELATIVE_POINTERS is pointless for
+ * !64BIT, so we expect the static_key to be at an 8-byte boundary
+ * since it contains stuff which is long-aligned.
+ */
+
+static_assert(BITS_PER_LONG == 64);
+static_assert(offsetof(struct _ddebug, modname_disp)  == 0);
+static_assert(offsetof(struct _ddebug, function_disp) == 4);
+static_assert(offsetof(struct _ddebug, filename_disp) == 8);
+static_assert(offsetof(struct _ddebug, format_disp)   == 12);
+static_assert(offsetof(struct _ddebug, flags_lineno)  == 16);
+
+#ifdef CONFIG_JUMP_LABEL
+static_assert(offsetof(struct _ddebug, key)        == 24);
+static_assert(offsetof(struct static_key, enabled) == 0);
+static_assert(offsetof(struct static_key, type)    == 8);
+
+/* <4 bytes padding>, .enabled, <4 bytes padding>, .type */
+# ifdef DEBUG
+# define _DPRINTK_ASM_KEY_INIT \
+	"\t.int 0\n" "\t.int 1\n" "\t.int 0\n" "\t.quad "__stringify(__JUMP_TYPE_TRUE)"\n"
+# else
+# define _DPRINTK_ASM_KEY_INIT \
+	"\t.int 0\n" "\t.int 0\n" "\t.int 0\n" "\t.quad "__stringify(__JUMP_TYPE_FALSE)"\n"
+# endif
+#else /* !CONFIG_JUMP_LABEL */
+#define _DPRINTK_ASM_KEY_INIT ""
+#endif
+
+/*
+ * There's a bit of magic involved here.
+ *
+ * First, unlike the bug table entries, we need to define an object in
+ * assembly which we can reference from C code (for use by the
+ * DYNAMIC_DEBUG_BRANCH macro), but we don't want 'name' to have
+ * external linkage (that would require use of globally unique
+ * identifiers, which we can't guarantee). Fortunately, the "extern"
+ * keyword just tells the compiler that _somebody_ provides that
+ * symbol - usually that somebody is the linker, but in this case it's
+ * the assembler, and since we do not do .globl name, the symbol gets
+ * internal linkage.
+ *
+ * So far so good. The next problem is that there's no scope in
+ * assembly, so the identifier 'name' has to be unique within each
+ * translation unit - otherwise all uses of that identifier end up
+ * referring to the same struct _ddebug instance. pr_debug and friends
+ * do this by use of indirection and __UNIQUE_ID(), and new users of
+ * DEFINE_DYNAMIC_DEBUG_METADATA() should do something similar. We
+ * need to catch cases where this is not done at build time.
+ *
+ * With assembly-level .ifndef we can ensure that we only define a
+ * given identifier once, preventing "symbol 'foo' already defined"
+ * errors. But we still need to detect and fail on multiple uses of
+ * the same identifer. The simplest, and wrong, solution to that is to
+ * add an .else .error branch to the .ifndef. The problem is that just
+ * because the DEFINE_DYNAMIC_DEBUG_METADATA() macro is only expanded
+ * once with a given identifier, the compiler may emit the assembly
+ * code multiple times, e.g. if the macro appears in an inline
+ * function. Now, in a normal case like
+ *
+ *   static inline get_next_id(void) { static int v; return ++v; }
+ *
+ * all inlined copies of get_next_id are _supposed_ to refer to the
+ * same object 'v'. So we do need to allow this chunk of assembly to
+ * appear multiple times with the same 'name', as long as they all
+ * came from the same DEFINE_DYNAMIC_DEBUG_METADATA() instance. To do
+ * that, we pass __COUNTER__ to the asm(), and set an assembler symbol
+ * name.ddebug.once to that value when we first define 'name'. When we
+ * meet a second attempt at defining 'name', we compare
+ * name.ddebug.once to %6 and error out if they are different.
+ */
+
+#define DEFINE_DYNAMIC_DEBUG_METADATA(name, fmt)			\
+	extern struct _ddebug name;					\
+	asm volatile(".ifndef " __stringify(name) "\n"			\
+		     ".pushsection __verbose,\"aw\"\n"			\
+		     ".type "__stringify(name)", STT_OBJECT\n"		\
+		     ".size "__stringify(name)", %c5\n"			\
+		     "1:\n"						\
+		     __stringify(name) ":\n"				\
+		     "\t.int %c0 - 1b /* _ddebug::modname_disp */\n"	\
+		     "\t.int %c1 - 1b /* _ddebug::function_disp */\n"	\
+		     "\t.int %c2 - 1b /* _ddebug::filename_disp */\n"	\
+		     "\t.int %c3 - 1b /* _ddebug::format_disp */\n"	\
+		     "\t.int %c4      /* _ddebug::flags_lineno */\n"	\
+		     _DPRINTK_ASM_KEY_INIT				\
+		     "\t.org 1b+%c5\n"					\
+		     ".popsection\n"					\
+		     ".set "__stringify(name)".ddebug.once, %c6\n"	\
+		     ".elseif "__stringify(name)".ddebug.once - %c6\n"	\
+		     ".line "__stringify(__LINE__) " - 1\n"             \
+		     ".error \"'"__stringify(name)"' used as _ddebug identifier more than once\"\n" \
+		     ".endif\n"						\
+		     : : "i" (KBUILD_MODNAME), "i" (__func__),		\
+		       "i" (__FILE__), "i" (fmt),			\
+		       "i" (_DPRINTK_FLAGS_LINENO_INIT),		\
+		       "i" (sizeof(struct _ddebug)), "i" (__COUNTER__))
+
+#endif /* _ASM_GENERIC_DYNAMIC_DEBUG_H */
diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h
index 3e113a1fa0f1..2b027d2ef3d0 100644
--- a/include/linux/jump_label.h
+++ b/include/linux/jump_label.h
@@ -190,6 +190,8 @@ struct module;
 
 #ifdef CONFIG_JUMP_LABEL
 
+#define __JUMP_TYPE_FALSE	0
+#define __JUMP_TYPE_TRUE	1
 #define JUMP_TYPE_FALSE		0UL
 #define JUMP_TYPE_TRUE		1UL
 #define JUMP_TYPE_LINKED	2UL
-- 
2.20.1
^ permalink raw reply related	[flat|nested] 11+ messages in thread
* [PATCH 10/10] powerpc: select DYNAMIC_DEBUG_RELATIVE_POINTERS for PPC64
  2019-04-09 21:25 [PATCH 00/10] implement DYNAMIC_DEBUG_RELATIVE_POINTERS Rasmus Villemoes
  2019-04-09 21:25 ` [PATCH 07/10] dynamic_debug: add asm-generic implementation for DYNAMIC_DEBUG_RELATIVE_POINTERS Rasmus Villemoes
@ 2019-04-09 21:25 ` Rasmus Villemoes
  2019-04-23 15:37   ` Christophe Leroy
  2019-05-06  6:48 ` [PATCH 00/10] implement DYNAMIC_DEBUG_RELATIVE_POINTERS Rasmus Villemoes
  2 siblings, 1 reply; 11+ messages in thread
From: Rasmus Villemoes @ 2019-04-09 21:25 UTC (permalink / raw)
  To: Andrew Morton, linuxppc-dev; +Cc: Jason Baron, Rasmus Villemoes, linux-kernel
Similar to GENERIC_BUG_RELATIVE_POINTERS, one can now relativize the
four const char* members of struct _ddebug, thus saving 16 bytes per
instance (one for each pr_debug(), dev_debug() etc. in a
CONFIG_DYNAMIC_DEBUG kernel). The asm-generic implementation seems to
work out-of-the-box, though this is only compile-tested.
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
---
 arch/powerpc/Kconfig            | 1 +
 arch/powerpc/include/asm/Kbuild | 1 +
 2 files changed, 2 insertions(+)
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 2d0be82c3061..6821c8ae1d62 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -155,6 +155,7 @@ config PPC
 	select BUILDTIME_EXTABLE_SORT
 	select CLONE_BACKWARDS
 	select DCACHE_WORD_ACCESS		if PPC64 && CPU_LITTLE_ENDIAN
+	select DYNAMIC_DEBUG_RELATIVE_POINTERS	if PPC64
 	select DYNAMIC_FTRACE			if FUNCTION_TRACER
 	select EDAC_ATOMIC_SCRUB
 	select EDAC_SUPPORT
diff --git a/arch/powerpc/include/asm/Kbuild b/arch/powerpc/include/asm/Kbuild
index a0c132bedfae..f332e202192a 100644
--- a/arch/powerpc/include/asm/Kbuild
+++ b/arch/powerpc/include/asm/Kbuild
@@ -3,6 +3,7 @@ generated-y += syscall_table_64.h
 generated-y += syscall_table_c32.h
 generated-y += syscall_table_spu.h
 generic-y += div64.h
+generic-y += dynamic_debug.h
 generic-y += export.h
 generic-y += irq_regs.h
 generic-y += local64.h
-- 
2.20.1
^ permalink raw reply related	[flat|nested] 11+ messages in thread
* Re: [PATCH 10/10] powerpc: select DYNAMIC_DEBUG_RELATIVE_POINTERS for PPC64
  2019-04-09 21:25 ` [PATCH 10/10] powerpc: select DYNAMIC_DEBUG_RELATIVE_POINTERS for PPC64 Rasmus Villemoes
@ 2019-04-23 15:37   ` Christophe Leroy
  2019-04-23 19:36     ` Andrew Morton
  0 siblings, 1 reply; 11+ messages in thread
From: Christophe Leroy @ 2019-04-23 15:37 UTC (permalink / raw)
  To: Rasmus Villemoes, Andrew Morton, linuxppc-dev; +Cc: Jason Baron, linux-kernel
Le 09/04/2019 à 23:25, Rasmus Villemoes a écrit :
> Similar to GENERIC_BUG_RELATIVE_POINTERS, one can now relativize the
> four const char* members of struct _ddebug, thus saving 16 bytes per
> instance (one for each pr_debug(), dev_debug() etc. in a
> CONFIG_DYNAMIC_DEBUG kernel). The asm-generic implementation seems to
> work out-of-the-box, though this is only compile-tested.
> 
> Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> ---
>   arch/powerpc/Kconfig            | 1 +
>   arch/powerpc/include/asm/Kbuild | 1 +
>   2 files changed, 2 insertions(+)
> 
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 2d0be82c3061..6821c8ae1d62 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -155,6 +155,7 @@ config PPC
>   	select BUILDTIME_EXTABLE_SORT
>   	select CLONE_BACKWARDS
>   	select DCACHE_WORD_ACCESS		if PPC64 && CPU_LITTLE_ENDIAN
> +	select DYNAMIC_DEBUG_RELATIVE_POINTERS	if PPC64
Why only PPC64 ? Won't it work with PPC32 ? Or would it be 
counter-performant on 32 bits ?
Christophe
>   	select DYNAMIC_FTRACE			if FUNCTION_TRACER
>   	select EDAC_ATOMIC_SCRUB
>   	select EDAC_SUPPORT
> diff --git a/arch/powerpc/include/asm/Kbuild b/arch/powerpc/include/asm/Kbuild
> index a0c132bedfae..f332e202192a 100644
> --- a/arch/powerpc/include/asm/Kbuild
> +++ b/arch/powerpc/include/asm/Kbuild
> @@ -3,6 +3,7 @@ generated-y += syscall_table_64.h
>   generated-y += syscall_table_c32.h
>   generated-y += syscall_table_spu.h
>   generic-y += div64.h
> +generic-y += dynamic_debug.h
>   generic-y += export.h
>   generic-y += irq_regs.h
>   generic-y += local64.h
> 
^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH 10/10] powerpc: select DYNAMIC_DEBUG_RELATIVE_POINTERS for PPC64
  2019-04-23 15:37   ` Christophe Leroy
@ 2019-04-23 19:36     ` Andrew Morton
  2019-04-24  6:46       ` Rasmus Villemoes
  0 siblings, 1 reply; 11+ messages in thread
From: Andrew Morton @ 2019-04-23 19:36 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: linuxppc-dev, Jason Baron, Rasmus Villemoes, linux-kernel
On Tue, 23 Apr 2019 17:37:33 +0200 Christophe Leroy <christophe.leroy@c-s.fr> wrote:
> > --- a/arch/powerpc/Kconfig
> > +++ b/arch/powerpc/Kconfig
> > @@ -155,6 +155,7 @@ config PPC
> >   	select BUILDTIME_EXTABLE_SORT
> >   	select CLONE_BACKWARDS
> >   	select DCACHE_WORD_ACCESS		if PPC64 && CPU_LITTLE_ENDIAN
> > +	select DYNAMIC_DEBUG_RELATIVE_POINTERS	if PPC64
> 
> Why only PPC64 ? Won't it work with PPC32 ? Or would it be 
> counter-performant on 32 bits ?
Ditto arm and i386?
^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH 10/10] powerpc: select DYNAMIC_DEBUG_RELATIVE_POINTERS for PPC64
  2019-04-23 19:36     ` Andrew Morton
@ 2019-04-24  6:46       ` Rasmus Villemoes
  0 siblings, 0 replies; 11+ messages in thread
From: Rasmus Villemoes @ 2019-04-24  6:46 UTC (permalink / raw)
  To: Andrew Morton, Christophe Leroy; +Cc: Jason Baron, linuxppc-dev, linux-kernel
On 23/04/2019 21.36, Andrew Morton wrote:
> On Tue, 23 Apr 2019 17:37:33 +0200 Christophe Leroy <christophe.leroy@c-s.fr> wrote:
> 
>>> --- a/arch/powerpc/Kconfig
>>> +++ b/arch/powerpc/Kconfig
>>> @@ -155,6 +155,7 @@ config PPC
>>>   	select BUILDTIME_EXTABLE_SORT
>>>   	select CLONE_BACKWARDS
>>>   	select DCACHE_WORD_ACCESS		if PPC64 && CPU_LITTLE_ENDIAN
>>> +	select DYNAMIC_DEBUG_RELATIVE_POINTERS	if PPC64
>>
>> Why only PPC64 ? Won't it work with PPC32 ? Or would it be 
>> counter-performant on 32 bits ?
> 
> Ditto arm and i386?
> 
It's pointless on 32-bit platforms - I'm replacing absolute const char*
pointers with a relative s32 offset from the _ddebug descriptor, so if
sizeof(void*)==4 there's no gain.
And yes, the current implementation also wouldn't work out-of-the-box
for 32-bit platforms, since the asm needs to know how to properly
initialize a whole struct _ddebug, which (often) contains a static_key,
which in turn contains a pointer member, which both affects its size as
well as its placement inside _ddebug. The C code in dynamic_debug.c
would likely Just Work, but there's no point in complicating the asm
part for no gain, so there are static_assert()s in place to ensure
BITS_PER_LONG==64 (as well as checking the various offsetof()s etc.).
[I don't think performance matters at all, it's one extra addition to
access these fields, and that is only done in the rare cases where
someone interacts with the dynamic_debug/control sysfs file, and when
one of the activated pr_debug()s is actually hit (so a few extra
instructions drown in the printk overhead).]
I do now see that PPC64 does not select GENERIC_BUG_RELATIVE_POINTERS,
so maybe this scheme simply doesn't work on PPC64, or nobody has done
the work to reduce the sizeof(struct bug_entry) on PPC64? As I said,
I've only compile-tested arm64 and ppc64.
Rasmus
^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH 00/10] implement DYNAMIC_DEBUG_RELATIVE_POINTERS
  2019-04-09 21:25 [PATCH 00/10] implement DYNAMIC_DEBUG_RELATIVE_POINTERS Rasmus Villemoes
  2019-04-09 21:25 ` [PATCH 07/10] dynamic_debug: add asm-generic implementation for DYNAMIC_DEBUG_RELATIVE_POINTERS Rasmus Villemoes
  2019-04-09 21:25 ` [PATCH 10/10] powerpc: select DYNAMIC_DEBUG_RELATIVE_POINTERS for PPC64 Rasmus Villemoes
@ 2019-05-06  6:48 ` Rasmus Villemoes
  2019-05-06  7:05   ` Ingo Molnar
  2 siblings, 1 reply; 11+ messages in thread
From: Rasmus Villemoes @ 2019-05-06  6:48 UTC (permalink / raw)
  To: Rasmus Villemoes, Andrew Morton
  Cc: Nick Desaulniers, Arnd Bergmann, x86, Will Deacon, linux-kernel,
	Jason Baron, Ingo Molnar, Nathan Chancellor, linuxppc-dev,
	linux-arm-kernel
On 09/04/2019 23.25, Rasmus Villemoes wrote:
> While refreshing these patches, which were orignally just targeted at
> x86-64, it occured to me that despite the implementation relying on
> inline asm, there's nothing x86 specific about it, and indeed it seems
> to work out-of-the-box for ppc64 and arm64 as well, but those have
> only been compile-tested.
So, apart from the Clang build failures for non-x86, I now also got a
report that gcc 4.8 miscompiles this stuff in some cases [1], even for
x86 - gcc 4.9 does not seem to have the problem. So, given that the 5.2
merge window just opened, I suppose this is the point where I should
pull the plug on this experiment :(
Rasmus
[1] Specifically, the problem manifested in net/ipv4/tcp_input.c: Both
uses of the static inline inet_csk_clear_xmit_timer() pass a
compile-time constant 'what', so the ifs get folded away and both uses
are completely inlined. Yet, gcc still decides to emit a copy of the
final 'else' branch of inet_csk_clear_xmit_timer() as its own
inet_csk_reset_xmit_timer.part.55 function, which is of course unused.
And despite the asm() that defines the ddebug descriptor being an "asm
volatile", gcc thinks it's fine to elide that (the code path is
unreachable, after all....), so the entire asm for that function is
        .section        .text.unlikely
        .type   inet_csk_reset_xmit_timer.part.55, @function
inet_csk_reset_xmit_timer.part.55:
        movq    $.LC1, %rsi     #,
        movq    $__UNIQUE_ID_ddebug160, %rdi    #,
        xorl    %eax, %eax      #
        jmp     __dynamic_pr_debug      #
        .size   inet_csk_reset_xmit_timer.part.55,
.-inet_csk_reset_xmit_timer.part.55
which of course fails to link since the symbol __UNIQUE_ID_ddebug160 is
nowhere defined.
^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH 00/10] implement DYNAMIC_DEBUG_RELATIVE_POINTERS
  2019-05-06  6:48 ` [PATCH 00/10] implement DYNAMIC_DEBUG_RELATIVE_POINTERS Rasmus Villemoes
@ 2019-05-06  7:05   ` Ingo Molnar
  2019-05-06  7:34     ` Rasmus Villemoes
  0 siblings, 1 reply; 11+ messages in thread
From: Ingo Molnar @ 2019-05-06  7:05 UTC (permalink / raw)
  To: Rasmus Villemoes
  Cc: Nick Desaulniers, Arnd Bergmann, x86, Will Deacon, linux-kernel,
	Linus Torvalds, Jason Baron, Ingo Molnar, Andy Lutomirski,
	Andrew Morton, linuxppc-dev, Nathan Chancellor, linux-arm-kernel
* Rasmus Villemoes <linux@rasmusvillemoes.dk> wrote:
> On 09/04/2019 23.25, Rasmus Villemoes wrote:
> 
> > While refreshing these patches, which were orignally just targeted at
> > x86-64, it occured to me that despite the implementation relying on
> > inline asm, there's nothing x86 specific about it, and indeed it seems
> > to work out-of-the-box for ppc64 and arm64 as well, but those have
> > only been compile-tested.
> 
> So, apart from the Clang build failures for non-x86, I now also got a
> report that gcc 4.8 miscompiles this stuff in some cases [1], even for
> x86 - gcc 4.9 does not seem to have the problem. So, given that the 5.2
> merge window just opened, I suppose this is the point where I should
> pull the plug on this experiment :(
> 
> Rasmus
> 
> [1] Specifically, the problem manifested in net/ipv4/tcp_input.c: Both
> uses of the static inline inet_csk_clear_xmit_timer() pass a
> compile-time constant 'what', so the ifs get folded away and both uses
> are completely inlined. Yet, gcc still decides to emit a copy of the
> final 'else' branch of inet_csk_clear_xmit_timer() as its own
> inet_csk_reset_xmit_timer.part.55 function, which is of course unused.
> And despite the asm() that defines the ddebug descriptor being an "asm
> volatile", gcc thinks it's fine to elide that (the code path is
> unreachable, after all....), so the entire asm for that function is
> 
>         .section        .text.unlikely
>         .type   inet_csk_reset_xmit_timer.part.55, @function
> inet_csk_reset_xmit_timer.part.55:
>         movq    $.LC1, %rsi     #,
>         movq    $__UNIQUE_ID_ddebug160, %rdi    #,
>         xorl    %eax, %eax      #
>         jmp     __dynamic_pr_debug      #
>         .size   inet_csk_reset_xmit_timer.part.55,
> .-inet_csk_reset_xmit_timer.part.55
> 
> which of course fails to link since the symbol __UNIQUE_ID_ddebug160 is
> nowhere defined.
It's sad to see such nice data footprint savings go the way of the dodo 
just because GCC 4.8 is buggy.
The current compatibility cut-off is GCC 4.6:
  GNU C                  4.6              gcc --version
Do we know where the GCC bug was fixed, was it in GCC 4.9?
According to the GCC release dates:
  https://www.gnu.org/software/gcc/releases.html
4.8.0 was released in early-2013, while 4.9.0 was released in early-2014. 
So the tooling compatibility window for latest upstream would narrow from 
~6 years to ~5 years.
Thanks,
	Ingo
^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH 00/10] implement DYNAMIC_DEBUG_RELATIVE_POINTERS
  2019-05-06  7:05   ` Ingo Molnar
@ 2019-05-06  7:34     ` Rasmus Villemoes
  2019-05-06  7:48       ` Ingo Molnar
  2019-05-06 14:48       ` Segher Boessenkool
  0 siblings, 2 replies; 11+ messages in thread
From: Rasmus Villemoes @ 2019-05-06  7:34 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Nick Desaulniers, Arnd Bergmann, x86, Will Deacon, linux-kernel,
	Linus Torvalds, Jason Baron, Ingo Molnar, Andy Lutomirski,
	Andrew Morton, linuxppc-dev, Nathan Chancellor, linux-arm-kernel
On 06/05/2019 09.05, Ingo Molnar wrote:
> 
> 
> It's sad to see such nice data footprint savings go the way of the dodo 
> just because GCC 4.8 is buggy.
> 
> The current compatibility cut-off is GCC 4.6:
> 
>   GNU C                  4.6              gcc --version
> 
> Do we know where the GCC bug was fixed, was it in GCC 4.9?
Not sure. The report was from a build on CentOS with gcc 4.8.5, so I
tried installing the gcc-4.8 package on my Ubuntu machine and could
reproduce. Then I tried installed gcc-4.9, and after disabling
CONFIG_RETPOLINE (both CentOS and Ubuntu carry backported retpoline
support in their 4.8, but apparently not 4.9), I could see that the
problem was gone. But whether it's gone because it no longer elides an
asm volatile() on a code path it otherwise emits code for, or because it
simply doesn't emit the unused static inline() at all I don't know.
I thought 0day also tested a range of supported compiler versions, so I
was rather surprised at getting this report at all. But I suppose the
arch/config matrix is already pretty huge. Anyway, the bug certainly
doesn't exist in any of the gcc versions 0day does test.
I _am_ bending the C rules a bit with the "extern some_var; asm
volatile(".section some_section\nsome_var: blabla");". I should probably
ask on the gcc list whether this way of defining a local symbol in
inline assembly and referring to it from C is supposed to work, or it
just happens to work by chance.
Rasmus
^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH 00/10] implement DYNAMIC_DEBUG_RELATIVE_POINTERS
  2019-05-06  7:34     ` Rasmus Villemoes
@ 2019-05-06  7:48       ` Ingo Molnar
  2019-05-06 14:48       ` Segher Boessenkool
  1 sibling, 0 replies; 11+ messages in thread
From: Ingo Molnar @ 2019-05-06  7:48 UTC (permalink / raw)
  To: Rasmus Villemoes
  Cc: Nick Desaulniers, Arnd Bergmann, x86, Will Deacon, linux-kernel,
	Linus Torvalds, Jason Baron, Ingo Molnar, Andy Lutomirski,
	Andrew Morton, linuxppc-dev, Nathan Chancellor, linux-arm-kernel
* Rasmus Villemoes <linux@rasmusvillemoes.dk> wrote:
> I _am_ bending the C rules a bit with the "extern some_var; asm 
> volatile(".section some_section\nsome_var: blabla");". I should 
> probably ask on the gcc list whether this way of defining a local 
> symbol in inline assembly and referring to it from C is supposed to 
> work, or it just happens to work by chance.
Doing that would be rather useful I think.
Thanks,
	Ingo
^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: [PATCH 00/10] implement DYNAMIC_DEBUG_RELATIVE_POINTERS
  2019-05-06  7:34     ` Rasmus Villemoes
  2019-05-06  7:48       ` Ingo Molnar
@ 2019-05-06 14:48       ` Segher Boessenkool
  1 sibling, 0 replies; 11+ messages in thread
From: Segher Boessenkool @ 2019-05-06 14:48 UTC (permalink / raw)
  To: Rasmus Villemoes
  Cc: linuxppc-dev, Arnd Bergmann, Nick Desaulniers, x86, Will Deacon,
	linux-kernel, Nathan Chancellor, Jason Baron, Ingo Molnar,
	Andy Lutomirski, Andrew Morton, Linus Torvalds, Ingo Molnar,
	linux-arm-kernel
On Mon, May 06, 2019 at 09:34:55AM +0200, Rasmus Villemoes wrote:
> I _am_ bending the C rules a bit with the "extern some_var; asm
> volatile(".section some_section\nsome_var: blabla");". I should probably
> ask on the gcc list whether this way of defining a local symbol in
> inline assembly and referring to it from C is supposed to work, or it
> just happens to work by chance.
It only works by chance.  There is no way GCC can know the asm needs
that variable.  If you make it (or its address) an input of the asm it
should work as far as I can see?  (Need exact code to analyse it exactly).
Segher
^ permalink raw reply	[flat|nested] 11+ messages in thread
end of thread, other threads:[~2019-05-06 14:52 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-04-09 21:25 [PATCH 00/10] implement DYNAMIC_DEBUG_RELATIVE_POINTERS Rasmus Villemoes
2019-04-09 21:25 ` [PATCH 07/10] dynamic_debug: add asm-generic implementation for DYNAMIC_DEBUG_RELATIVE_POINTERS Rasmus Villemoes
2019-04-09 21:25 ` [PATCH 10/10] powerpc: select DYNAMIC_DEBUG_RELATIVE_POINTERS for PPC64 Rasmus Villemoes
2019-04-23 15:37   ` Christophe Leroy
2019-04-23 19:36     ` Andrew Morton
2019-04-24  6:46       ` Rasmus Villemoes
2019-05-06  6:48 ` [PATCH 00/10] implement DYNAMIC_DEBUG_RELATIVE_POINTERS Rasmus Villemoes
2019-05-06  7:05   ` Ingo Molnar
2019-05-06  7:34     ` Rasmus Villemoes
2019-05-06  7:48       ` Ingo Molnar
2019-05-06 14:48       ` Segher Boessenkool
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).