* [PATCH 1/2] tracepoints: Do not trace when cpu is offline
[not found] <20160216194908.005437159@goodmis.org>
@ 2016-02-16 19:49 ` Steven Rostedt
2016-02-16 20:09 ` Mathieu Desnoyers
2016-02-16 19:49 ` [PATCH 2/2] tracing: Fix freak link error caused by branch tracer Steven Rostedt
1 sibling, 1 reply; 4+ messages in thread
From: Steven Rostedt @ 2016-02-16 19:49 UTC (permalink / raw)
To: linux-kernel
Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Paul E. McKenney,
Mathieu Desnoyers, Denis Kirjanov, stable
[-- Attachment #1: 0001-tracepoints-Do-not-trace-when-cpu-is-offline.patch --]
[-- Type: text/plain, Size: 3297 bytes --]
From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>
The tracepoint infrastructure uses RCU sched protection to enable and
disable tracepoints safely. There are some instances where tracepoints are
used in infrastructure code (like kfree()) that get called after a CPU is
going offline, and perhaps when it is coming back online but hasn't been
registered yet.
This can probuce the following warning:
[ INFO: suspicious RCU usage. ]
4.4.0-00006-g0fe53e8-dirty #34 Tainted: G S
-------------------------------
include/trace/events/kmem.h:141 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
RCU used illegally from offline CPU! rcu_scheduler_active = 1, debug_locks = 1
no locks held by swapper/8/0.
stack backtrace:
CPU: 8 PID: 0 Comm: swapper/8 Tainted: G S 4.4.0-00006-g0fe53e8-dirty #34
Call Trace:
[c0000005b76c78d0] [c0000000008b9540] .dump_stack+0x98/0xd4 (unreliable)
[c0000005b76c7950] [c00000000010c898] .lockdep_rcu_suspicious+0x108/0x170
[c0000005b76c79e0] [c00000000029adc0] .kfree+0x390/0x440
[c0000005b76c7a80] [c000000000055f74] .destroy_context+0x44/0x100
[c0000005b76c7b00] [c0000000000934a0] .__mmdrop+0x60/0x150
[c0000005b76c7b90] [c0000000000e3ff0] .idle_task_exit+0x130/0x140
[c0000005b76c7c20] [c000000000075804] .pseries_mach_cpu_die+0x64/0x310
[c0000005b76c7cd0] [c000000000043e7c] .cpu_die+0x3c/0x60
[c0000005b76c7d40] [c0000000000188d8] .arch_cpu_idle_dead+0x28/0x40
[c0000005b76c7db0] [c000000000101e6c] .cpu_startup_entry+0x50c/0x560
[c0000005b76c7ed0] [c000000000043bd8] .start_secondary+0x328/0x360
[c0000005b76c7f90] [c000000000008a6c] start_secondary_prolog+0x10/0x14
This warning is not a false positive either. RCU is not protecting code that
is being executed while the CPU is offline.
Instead of playing "whack-a-mole(TM)" and adding conditional statements to
the tracepoints we find that are used in this instance, simply add a
cpu_online() test to the tracepoint code where the tracepoint will be
ignored if the CPU is offline.
Use of raw_smp_processor_id() is fine, as there should never be a case where
the tracepoint code goes from running on a CPU that is online and suddenly
gets migrated to a CPU that is offline.
Link: http://lkml.kernel.org/r/1455387773-4245-1-git-send-email-kda@linux-powerpc.org
Reported-by: Denis Kirjanov <kda@linux-powerpc.org>
Fixes: 97e1c18e8d17b ("tracing: Kernel Tracepoints")
Cc: stable@vger.kernel.org # v2.6.28+
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
include/linux/tracepoint.h | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
index acd522a91539..acfdbf353a0b 100644
--- a/include/linux/tracepoint.h
+++ b/include/linux/tracepoint.h
@@ -14,8 +14,10 @@
* See the file COPYING for more details.
*/
+#include <linux/smp.h>
#include <linux/errno.h>
#include <linux/types.h>
+#include <linux/cpumask.h>
#include <linux/rcupdate.h>
#include <linux/tracepoint-defs.h>
@@ -132,6 +134,9 @@ extern void syscall_unregfunc(void);
void *it_func; \
void *__data; \
\
+ if (!cpu_online(raw_smp_processor_id())) \
+ return; \
+ \
if (!(cond)) \
return; \
prercu; \
--
2.6.4
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH 2/2] tracing: Fix freak link error caused by branch tracer
[not found] <20160216194908.005437159@goodmis.org>
2016-02-16 19:49 ` [PATCH 1/2] tracepoints: Do not trace when cpu is offline Steven Rostedt
@ 2016-02-16 19:49 ` Steven Rostedt
1 sibling, 0 replies; 4+ messages in thread
From: Steven Rostedt @ 2016-02-16 19:49 UTC (permalink / raw)
To: linux-kernel
Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Paul E. McKenney,
Mathieu Desnoyers, Nicolas Pitre, stable, Arnd Bergmann
[-- Attachment #1: 0002-tracing-Fix-freak-link-error-caused-by-branch-tracer.patch --]
[-- Type: text/plain, Size: 2816 bytes --]
From: Arnd Bergmann <arnd@arndb.de>
In my randconfig tests, I came across a bug that involves several
components:
* gcc-4.9 through at least 5.3
* CONFIG_GCOV_PROFILE_ALL enabling -fprofile-arcs for all files
* CONFIG_PROFILE_ALL_BRANCHES overriding every if()
* The optimized implementation of do_div() that tries to
replace a library call with an division by multiplication
* code in drivers/media/dvb-frontends/zl10353.c doing
u32 adc_clock = 450560; /* 45.056 MHz */
if (state->config.adc_clock)
adc_clock = state->config.adc_clock;
do_div(value, adc_clock);
In this case, gcc fails to determine whether the divisor
in do_div() is __builtin_constant_p(). In particular, it
concludes that __builtin_constant_p(adc_clock) is false, while
__builtin_constant_p(!!adc_clock) is true.
That in turn throws off the logic in do_div() that also uses
__builtin_constant_p(), and instead of picking either the
constant- optimized division, and the code in ilog2() that uses
__builtin_constant_p() to figure out whether it knows the answer at
compile time. The result is a link error from failing to find
multiple symbols that should never have been called based on
the __builtin_constant_p():
dvb-frontends/zl10353.c:138: undefined reference to `____ilog2_NaN'
dvb-frontends/zl10353.c:138: undefined reference to `__aeabi_uldivmod'
ERROR: "____ilog2_NaN" [drivers/media/dvb-frontends/zl10353.ko] undefined!
ERROR: "__aeabi_uldivmod" [drivers/media/dvb-frontends/zl10353.ko] undefined!
This patch avoids the problem by changing __trace_if() to check
whether the condition is known at compile-time to be nonzero, rather
than checking whether it is actually a constant.
I see this one link error in roughly one out of 1600 randconfig builds
on ARM, and the patch fixes all known instances.
Link: http://lkml.kernel.org/r/1455312410-1058841-1-git-send-email-arnd@arndb.de
Acked-by: Nicolas Pitre <nico@linaro.org>
Fixes: ab3c9c686e22 ("branch tracer, intel-iommu: fix build with CONFIG_BRANCH_TRACER=y")
Cc: stable@vger.kernel.org # v2.6.30+
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
include/linux/compiler.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 00b042c49ccd..48f5aab117ae 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -144,7 +144,7 @@ void ftrace_likely_update(struct ftrace_branch_data *f, int val, int expect);
*/
#define if(cond, ...) __trace_if( (cond , ## __VA_ARGS__) )
#define __trace_if(cond) \
- if (__builtin_constant_p((cond)) ? !!(cond) : \
+ if (__builtin_constant_p(!!(cond)) ? !!(cond) : \
({ \
int ______r; \
static struct ftrace_branch_data \
--
2.6.4
^ permalink raw reply related [flat|nested] 4+ messages in thread