* [PATCH 0/2] [GIT PULL] tracing: A couple of fixes
@ 2016-02-16 19:49 Steven Rostedt
2016-02-16 19:49 ` [PATCH 1/2] tracepoints: Do not trace when cpu is offline Steven Rostedt
2016-02-16 19:49 ` [PATCH 2/2] tracing: Fix freak link error caused by branch tracer Steven Rostedt
0 siblings, 2 replies; 5+ messages in thread
From: Steven Rostedt @ 2016-02-16 19:49 UTC (permalink / raw)
To: linux-kernel
Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Paul E. McKenney,
Mathieu Desnoyers
Linus,
This includes two fixes.
The first is something that has come up a few times and has been
worked out individually, but it's come up now enough that the problem
should be generic. Tracepoints are protected by RCU sched. There are
several tracepoints within core infrastructure like kfree().
If a tracepoint is called when the CPU is going down, or when it's
coming up but has yet to be recognized by RCU, a RCU warning is
triggered. This is a true bug as that tracepoint is not protected by
RCU. Usually, this is taken care of by testing for cpu online as
a tracepoint condition. But as this is happening more often, moving
it from a individual tracepoint to a check in the tracepoint infrastructure
is more robust.
Note, there is now a duplicate of a cpu online test, because this update
does not remove the individual checks. But the overhead is small enough
that the removal can be done in another release.
The second change is strange linker breakage due to the branch tracer's
builtin_constant_p() check failing, and treating the condition as a
variable instead of a constant. Arnd Bergmann found that this can be
fixed by testing !!(cond) instead of just (cond).
Please pull the latest trace-fixes-v4.5-rc4 tree, which can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
trace-fixes-v4.5-rc4
Tag SHA1: abfeb62bfa1a219b42a0e984db59f6535a741b17
Head SHA1: b33c8ff4431a343561e2319f17c14286f2aa52e2
Arnd Bergmann (1):
tracing: Fix freak link error caused by branch tracer
Steven Rostedt (Red Hat) (1):
tracepoints: Do not trace when cpu is offline
----
include/linux/compiler.h | 2 +-
include/linux/tracepoint.h | 5 +++++
2 files changed, 6 insertions(+), 1 deletion(-)
^ permalink raw reply [flat|nested] 5+ messages in thread* [PATCH 1/2] tracepoints: Do not trace when cpu is offline
2016-02-16 19:49 [PATCH 0/2] [GIT PULL] tracing: A couple of fixes Steven Rostedt
@ 2016-02-16 19:49 ` Steven Rostedt
2016-02-16 20:09 ` Mathieu Desnoyers
2016-02-16 19:49 ` [PATCH 2/2] tracing: Fix freak link error caused by branch tracer Steven Rostedt
1 sibling, 1 reply; 5+ messages in thread
From: Steven Rostedt @ 2016-02-16 19:49 UTC (permalink / raw)
To: linux-kernel
Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Paul E. McKenney,
Mathieu Desnoyers, Denis Kirjanov, stable
[-- Attachment #1: 0001-tracepoints-Do-not-trace-when-cpu-is-offline.patch --]
[-- Type: text/plain, Size: 3295 bytes --]
From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>
The tracepoint infrastructure uses RCU sched protection to enable and
disable tracepoints safely. There are some instances where tracepoints are
used in infrastructure code (like kfree()) that get called after a CPU is
going offline, and perhaps when it is coming back online but hasn't been
registered yet.
This can probuce the following warning:
[ INFO: suspicious RCU usage. ]
4.4.0-00006-g0fe53e8-dirty #34 Tainted: G S
-------------------------------
include/trace/events/kmem.h:141 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
RCU used illegally from offline CPU! rcu_scheduler_active = 1, debug_locks = 1
no locks held by swapper/8/0.
stack backtrace:
CPU: 8 PID: 0 Comm: swapper/8 Tainted: G S 4.4.0-00006-g0fe53e8-dirty #34
Call Trace:
[c0000005b76c78d0] [c0000000008b9540] .dump_stack+0x98/0xd4 (unreliable)
[c0000005b76c7950] [c00000000010c898] .lockdep_rcu_suspicious+0x108/0x170
[c0000005b76c79e0] [c00000000029adc0] .kfree+0x390/0x440
[c0000005b76c7a80] [c000000000055f74] .destroy_context+0x44/0x100
[c0000005b76c7b00] [c0000000000934a0] .__mmdrop+0x60/0x150
[c0000005b76c7b90] [c0000000000e3ff0] .idle_task_exit+0x130/0x140
[c0000005b76c7c20] [c000000000075804] .pseries_mach_cpu_die+0x64/0x310
[c0000005b76c7cd0] [c000000000043e7c] .cpu_die+0x3c/0x60
[c0000005b76c7d40] [c0000000000188d8] .arch_cpu_idle_dead+0x28/0x40
[c0000005b76c7db0] [c000000000101e6c] .cpu_startup_entry+0x50c/0x560
[c0000005b76c7ed0] [c000000000043bd8] .start_secondary+0x328/0x360
[c0000005b76c7f90] [c000000000008a6c] start_secondary_prolog+0x10/0x14
This warning is not a false positive either. RCU is not protecting code that
is being executed while the CPU is offline.
Instead of playing "whack-a-mole(TM)" and adding conditional statements to
the tracepoints we find that are used in this instance, simply add a
cpu_online() test to the tracepoint code where the tracepoint will be
ignored if the CPU is offline.
Use of raw_smp_processor_id() is fine, as there should never be a case where
the tracepoint code goes from running on a CPU that is online and suddenly
gets migrated to a CPU that is offline.
Link: http://lkml.kernel.org/r/1455387773-4245-1-git-send-email-kda@linux-powerpc.org
Reported-by: Denis Kirjanov <kda@linux-powerpc.org>
Fixes: 97e1c18e8d17b ("tracing: Kernel Tracepoints")
Cc: stable@vger.kernel.org # v2.6.28+
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
include/linux/tracepoint.h | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
index acd522a91539..acfdbf353a0b 100644
--- a/include/linux/tracepoint.h
+++ b/include/linux/tracepoint.h
@@ -14,8 +14,10 @@
* See the file COPYING for more details.
*/
+#include <linux/smp.h>
#include <linux/errno.h>
#include <linux/types.h>
+#include <linux/cpumask.h>
#include <linux/rcupdate.h>
#include <linux/tracepoint-defs.h>
@@ -132,6 +134,9 @@ extern void syscall_unregfunc(void);
void *it_func; \
void *__data; \
\
+ if (!cpu_online(raw_smp_processor_id())) \
+ return; \
+ \
if (!(cond)) \
return; \
prercu; \
--
2.6.4
^ permalink raw reply related [flat|nested] 5+ messages in thread* Re: [PATCH 1/2] tracepoints: Do not trace when cpu is offline
2016-02-16 19:49 ` [PATCH 1/2] tracepoints: Do not trace when cpu is offline Steven Rostedt
@ 2016-02-16 20:09 ` Mathieu Desnoyers
2016-02-16 20:32 ` Steven Rostedt
0 siblings, 1 reply; 5+ messages in thread
From: Mathieu Desnoyers @ 2016-02-16 20:09 UTC (permalink / raw)
To: rostedt, Thomas Gleixner
Cc: linux-kernel, Linus Torvalds, Ingo Molnar, Andrew Morton,
Paul E. McKenney, Denis Kirjanov, stable
----- On Feb 16, 2016, at 2:49 PM, rostedt rostedt@goodmis.org wrote:
> From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>
>
> The tracepoint infrastructure uses RCU sched protection to enable and
> disable tracepoints safely. There are some instances where tracepoints are
> used in infrastructure code (like kfree()) that get called after a CPU is
> going offline, and perhaps when it is coming back online but hasn't been
> registered yet.
>
> This can probuce the following warning:
>
> [ INFO: suspicious RCU usage. ]
> 4.4.0-00006-g0fe53e8-dirty #34 Tainted: G S
> -------------------------------
> include/trace/events/kmem.h:141 suspicious rcu_dereference_check() usage!
>
> other info that might help us debug this:
>
> RCU used illegally from offline CPU! rcu_scheduler_active = 1, debug_locks = 1
> no locks held by swapper/8/0.
>
> stack backtrace:
> CPU: 8 PID: 0 Comm: swapper/8 Tainted: G S
> 4.4.0-00006-g0fe53e8-dirty #34
> Call Trace:
> [c0000005b76c78d0] [c0000000008b9540] .dump_stack+0x98/0xd4 (unreliable)
> [c0000005b76c7950] [c00000000010c898] .lockdep_rcu_suspicious+0x108/0x170
> [c0000005b76c79e0] [c00000000029adc0] .kfree+0x390/0x440
> [c0000005b76c7a80] [c000000000055f74] .destroy_context+0x44/0x100
> [c0000005b76c7b00] [c0000000000934a0] .__mmdrop+0x60/0x150
> [c0000005b76c7b90] [c0000000000e3ff0] .idle_task_exit+0x130/0x140
> [c0000005b76c7c20] [c000000000075804] .pseries_mach_cpu_die+0x64/0x310
> [c0000005b76c7cd0] [c000000000043e7c] .cpu_die+0x3c/0x60
> [c0000005b76c7d40] [c0000000000188d8] .arch_cpu_idle_dead+0x28/0x40
> [c0000005b76c7db0] [c000000000101e6c] .cpu_startup_entry+0x50c/0x560
> [c0000005b76c7ed0] [c000000000043bd8] .start_secondary+0x328/0x360
> [c0000005b76c7f90] [c000000000008a6c] start_secondary_prolog+0x10/0x14
>
> This warning is not a false positive either. RCU is not protecting code that
> is being executed while the CPU is offline.
>
> Instead of playing "whack-a-mole(TM)" and adding conditional statements to
> the tracepoints we find that are used in this instance, simply add a
> cpu_online() test to the tracepoint code where the tracepoint will be
> ignored if the CPU is offline.
>
> Use of raw_smp_processor_id() is fine, as there should never be a case where
> the tracepoint code goes from running on a CPU that is online and suddenly
> gets migrated to a CPU that is offline.
>
> Link:
> http://lkml.kernel.org/r/1455387773-4245-1-git-send-email-kda@linux-powerpc.org
If I get this right, you are proposing to "hide" events happening
during CPU hot-unplug on dying CPUs from the tracers to fix an issue
caused by interaction of RCU-sched (used for Tracepoint synchronization)
wrt CPU hotplug.
Removing tracing visibility of hot-unplug events seems to be an unwelcome
side-effect. I don't know how far Thomas Gleixner got in his overhaul of
CPU hotplug, but he might have something to say about this, as I believe
he would be the first user concerned.
Thoughts ?
Thanks,
Mathieu
>
> Reported-by: Denis Kirjanov <kda@linux-powerpc.org>
> Fixes: 97e1c18e8d17b ("tracing: Kernel Tracepoints")
> Cc: stable@vger.kernel.org # v2.6.28+
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> ---
> include/linux/tracepoint.h | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
> index acd522a91539..acfdbf353a0b 100644
> --- a/include/linux/tracepoint.h
> +++ b/include/linux/tracepoint.h
> @@ -14,8 +14,10 @@
> * See the file COPYING for more details.
> */
>
> +#include <linux/smp.h>
> #include <linux/errno.h>
> #include <linux/types.h>
> +#include <linux/cpumask.h>
> #include <linux/rcupdate.h>
> #include <linux/tracepoint-defs.h>
>
> @@ -132,6 +134,9 @@ extern void syscall_unregfunc(void);
> void *it_func; \
> void *__data; \
> \
> + if (!cpu_online(raw_smp_processor_id())) \
> + return; \
> + \
> if (!(cond)) \
> return; \
> prercu; \
> --
> 2.6.4
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
^ permalink raw reply [flat|nested] 5+ messages in thread* Re: [PATCH 1/2] tracepoints: Do not trace when cpu is offline
2016-02-16 20:09 ` Mathieu Desnoyers
@ 2016-02-16 20:32 ` Steven Rostedt
0 siblings, 0 replies; 5+ messages in thread
From: Steven Rostedt @ 2016-02-16 20:32 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: Thomas Gleixner, linux-kernel, Linus Torvalds, Ingo Molnar,
Andrew Morton, Paul E. McKenney, Denis Kirjanov, stable
On Tue, 16 Feb 2016 20:09:35 +0000 (UTC)
Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:
> If I get this right, you are proposing to "hide" events happening
> during CPU hot-unplug on dying CPUs from the tracers to fix an issue
> caused by interaction of RCU-sched (used for Tracepoint synchronization)
> wrt CPU hotplug.
>
> Removing tracing visibility of hot-unplug events seems to be an unwelcome
> side-effect. I don't know how far Thomas Gleixner got in his overhaul of
> CPU hotplug, but he might have something to say about this, as I believe
> he would be the first user concerned.
>
Well, trace_printk() still works. But right now you *can't* have a
tracepoint executed on a CPU that is offline, because it is a bug.
Period. That's because we use RCU sched to protect tracepoints. When
the CPU is offline, there is no protection. It is possible that the
tracepoint structures may get corrupted, or worse, crash the system.
Granted, the race is quite small but it is a bug never the less.
Now, if you want tracepoints to be visible for CPUs that are offline,
then we need something else to protect it. But until then, this fixes
the issue.
And before this patch, we've been adding conditional tracepoints to
check "if (cpu_online(raw_smp_processor_id()))" when a warning appeared.
This patch gets rid of the need to keep adding these whack-a-mole
patches.
-- Steve
^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH 2/2] tracing: Fix freak link error caused by branch tracer
2016-02-16 19:49 [PATCH 0/2] [GIT PULL] tracing: A couple of fixes Steven Rostedt
2016-02-16 19:49 ` [PATCH 1/2] tracepoints: Do not trace when cpu is offline Steven Rostedt
@ 2016-02-16 19:49 ` Steven Rostedt
1 sibling, 0 replies; 5+ messages in thread
From: Steven Rostedt @ 2016-02-16 19:49 UTC (permalink / raw)
To: linux-kernel
Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Paul E. McKenney,
Mathieu Desnoyers, Nicolas Pitre, stable, Arnd Bergmann
[-- Attachment #1: 0002-tracing-Fix-freak-link-error-caused-by-branch-tracer.patch --]
[-- Type: text/plain, Size: 2816 bytes --]
From: Arnd Bergmann <arnd@arndb.de>
In my randconfig tests, I came across a bug that involves several
components:
* gcc-4.9 through at least 5.3
* CONFIG_GCOV_PROFILE_ALL enabling -fprofile-arcs for all files
* CONFIG_PROFILE_ALL_BRANCHES overriding every if()
* The optimized implementation of do_div() that tries to
replace a library call with an division by multiplication
* code in drivers/media/dvb-frontends/zl10353.c doing
u32 adc_clock = 450560; /* 45.056 MHz */
if (state->config.adc_clock)
adc_clock = state->config.adc_clock;
do_div(value, adc_clock);
In this case, gcc fails to determine whether the divisor
in do_div() is __builtin_constant_p(). In particular, it
concludes that __builtin_constant_p(adc_clock) is false, while
__builtin_constant_p(!!adc_clock) is true.
That in turn throws off the logic in do_div() that also uses
__builtin_constant_p(), and instead of picking either the
constant- optimized division, and the code in ilog2() that uses
__builtin_constant_p() to figure out whether it knows the answer at
compile time. The result is a link error from failing to find
multiple symbols that should never have been called based on
the __builtin_constant_p():
dvb-frontends/zl10353.c:138: undefined reference to `____ilog2_NaN'
dvb-frontends/zl10353.c:138: undefined reference to `__aeabi_uldivmod'
ERROR: "____ilog2_NaN" [drivers/media/dvb-frontends/zl10353.ko] undefined!
ERROR: "__aeabi_uldivmod" [drivers/media/dvb-frontends/zl10353.ko] undefined!
This patch avoids the problem by changing __trace_if() to check
whether the condition is known at compile-time to be nonzero, rather
than checking whether it is actually a constant.
I see this one link error in roughly one out of 1600 randconfig builds
on ARM, and the patch fixes all known instances.
Link: http://lkml.kernel.org/r/1455312410-1058841-1-git-send-email-arnd@arndb.de
Acked-by: Nicolas Pitre <nico@linaro.org>
Fixes: ab3c9c686e22 ("branch tracer, intel-iommu: fix build with CONFIG_BRANCH_TRACER=y")
Cc: stable@vger.kernel.org # v2.6.30+
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
---
include/linux/compiler.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 00b042c49ccd..48f5aab117ae 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -144,7 +144,7 @@ void ftrace_likely_update(struct ftrace_branch_data *f, int val, int expect);
*/
#define if(cond, ...) __trace_if( (cond , ## __VA_ARGS__) )
#define __trace_if(cond) \
- if (__builtin_constant_p((cond)) ? !!(cond) : \
+ if (__builtin_constant_p(!!(cond)) ? !!(cond) : \
({ \
int ______r; \
static struct ftrace_branch_data \
--
2.6.4
^ permalink raw reply related [flat|nested] 5+ messages in thread
end of thread, other threads:[~2016-02-16 20:32 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-02-16 19:49 [PATCH 0/2] [GIT PULL] tracing: A couple of fixes Steven Rostedt
2016-02-16 19:49 ` [PATCH 1/2] tracepoints: Do not trace when cpu is offline Steven Rostedt
2016-02-16 20:09 ` Mathieu Desnoyers
2016-02-16 20:32 ` Steven Rostedt
2016-02-16 19:49 ` [PATCH 2/2] tracing: Fix freak link error caused by branch tracer Steven Rostedt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox