From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1756033AbcBPUJp (ORCPT );
	Tue, 16 Feb 2016 15:09:45 -0500
Received: from mail.efficios.com ([78.47.125.74]:35174 "EHLO
	mail.efficios.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1756001AbcBPUJm (ORCPT );
	Tue, 16 Feb 2016 15:09:42 -0500
Date: Tue, 16 Feb 2016 20:09:35 +0000 (UTC)
From: Mathieu Desnoyers
To: rostedt, Thomas Gleixner
Cc: linux-kernel@vger.kernel.org, Linus Torvalds, Ingo Molnar,
	Andrew Morton, "Paul E. McKenney", Denis Kirjanov,
	stable@vger.kernel.org
Message-ID: <740376947.21497.1455653375751.JavaMail.zimbra@efficios.com>
In-Reply-To: <20160216194947.184748136@goodmis.org>
References: <20160216194908.005437159@goodmis.org>
	<20160216194947.184748136@goodmis.org>
Subject: Re: [PATCH 1/2] tracepoints: Do not trace when cpu is offline
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-Originating-IP: [78.47.125.74]
X-Mailer: Zimbra 8.6.0_GA_1178 (ZimbraWebClient - FF44 (Linux)/8.6.0_GA_1178)
Thread-Topic: tracepoints: Do not trace when cpu is offline
Thread-Index: dBD0eUVm2HqM0v1Q0unx/LFRC0mXYg==
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

----- On Feb 16, 2016, at 2:49 PM, rostedt rostedt@goodmis.org wrote:

> From: "Steven Rostedt (Red Hat)"
>
> The tracepoint infrastructure uses RCU sched protection to enable and
> disable tracepoints safely. There are some instances where tracepoints are
> used in infrastructure code (like kfree()) that get called after a CPU is
> going offline, and perhaps when it is coming back online but hasn't been
> registered yet.
>
> This can produce the following warning:
>
> [ INFO: suspicious RCU usage. ]
> 4.4.0-00006-g0fe53e8-dirty #34 Tainted: G S
> -------------------------------
> include/trace/events/kmem.h:141 suspicious rcu_dereference_check() usage!
>
> other info that might help us debug this:
>
> RCU used illegally from offline CPU! rcu_scheduler_active = 1, debug_locks = 1
> no locks held by swapper/8/0.
>
> stack backtrace:
> CPU: 8 PID: 0 Comm: swapper/8 Tainted: G S
> 4.4.0-00006-g0fe53e8-dirty #34
> Call Trace:
> [c0000005b76c78d0] [c0000000008b9540] .dump_stack+0x98/0xd4 (unreliable)
> [c0000005b76c7950] [c00000000010c898] .lockdep_rcu_suspicious+0x108/0x170
> [c0000005b76c79e0] [c00000000029adc0] .kfree+0x390/0x440
> [c0000005b76c7a80] [c000000000055f74] .destroy_context+0x44/0x100
> [c0000005b76c7b00] [c0000000000934a0] .__mmdrop+0x60/0x150
> [c0000005b76c7b90] [c0000000000e3ff0] .idle_task_exit+0x130/0x140
> [c0000005b76c7c20] [c000000000075804] .pseries_mach_cpu_die+0x64/0x310
> [c0000005b76c7cd0] [c000000000043e7c] .cpu_die+0x3c/0x60
> [c0000005b76c7d40] [c0000000000188d8] .arch_cpu_idle_dead+0x28/0x40
> [c0000005b76c7db0] [c000000000101e6c] .cpu_startup_entry+0x50c/0x560
> [c0000005b76c7ed0] [c000000000043bd8] .start_secondary+0x328/0x360
> [c0000005b76c7f90] [c000000000008a6c] start_secondary_prolog+0x10/0x14
>
> This warning is not a false positive either. RCU is not protecting code that
> is being executed while the CPU is offline.
>
> Instead of playing "whack-a-mole(TM)" and adding conditional statements to
> the tracepoints we find that are used in this instance, simply add a
> cpu_online() test to the tracepoint code where the tracepoint will be
> ignored if the CPU is offline.
>
> Use of raw_smp_processor_id() is fine, as there should never be a case where
> the tracepoint code goes from running on a CPU that is online and suddenly
> gets migrated to a CPU that is offline.
>
> Link:
> http://lkml.kernel.org/r/1455387773-4245-1-git-send-email-kda@linux-powerpc.org

If I get this right, you are proposing to "hide" events happening during
CPU hot-unplug on dying CPUs from the tracers to fix an issue caused by
interaction of RCU-sched (used for Tracepoint synchronization) wrt CPU hotplug.
Removing tracing visibility of hot-unplug events seems to be an unwelcome
side-effect.

I don't know how far Thomas Gleixner got in his overhaul of CPU hotplug, but
he might have something to say about this, as I believe he would be the first
user concerned.

Thoughts ?

Thanks,

Mathieu

>
> Reported-by: Denis Kirjanov
> Fixes: 97e1c18e8d17b ("tracing: Kernel Tracepoints")
> Cc: stable@vger.kernel.org # v2.6.28+
> Signed-off-by: Steven Rostedt
> ---
> include/linux/tracepoint.h | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
> index acd522a91539..acfdbf353a0b 100644
> --- a/include/linux/tracepoint.h
> +++ b/include/linux/tracepoint.h
> @@ -14,8 +14,10 @@
>   * See the file COPYING for more details.
>   */
>
> +#include
>  #include
>  #include
> +#include
>  #include
>  #include
>
> @@ -132,6 +134,9 @@ extern void syscall_unregfunc(void);
>  		void *it_func;					\
>  		void *__data;					\
>  								\
> +		if (!cpu_online(raw_smp_processor_id()))	\
> +			return;					\
> +								\
>  		if (!(cond))					\
>  			return;					\
>  		prercu;						\
> --
> 2.6.4

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
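
As a rough illustration of the trade-off discussed above, the following
userspace C model mimics the order of checks the quoted patch adds: an online
test on the current CPU first, then the tracepoint condition, then the probe
call. It is not kernel code. Every model_* name and MODEL_NR_CPUS are
hypothetical stand-ins introduced only for this sketch: model_current_cpu
plays the role of raw_smp_processor_id() and model_cpu_online() the role of
cpu_online().

```c
/* Userspace model of the guard in the quoted patch; NOT kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_CPUS 4	/* hypothetical CPU count for this model */

static bool model_online_mask[MODEL_NR_CPUS];	/* all CPUs start offline */
static int model_current_cpu;	/* stand-in for raw_smp_processor_id() */
static int model_probe_hits;	/* number of times the probe actually ran */

/* Stand-in for cpu_online(): reports whether a CPU is marked online. */
static bool model_cpu_online(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	return model_online_mask[cpu];
}

/* Marks a CPU online or offline in the model. */
static void model_set_cpu_online(int cpu, bool online)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	model_online_mask[cpu] = online;
}

/* Stand-in for a registered tracer callback. */
static void model_probe(void *data)
{
	(void)data;
	model_probe_hits++;
}

/*
 * Same order of checks as the patched macro body: the online test comes
 * first, so an event fired on an offline CPU never reaches the probe.
 */
static void model_do_trace(bool cond)
{
	if (!model_cpu_online(model_current_cpu))
		return;

	if (!cond)
		return;

	model_probe(NULL);
}
```

In this model, calling model_do_trace(true) while the current CPU is marked
offline leaves model_probe_hits unchanged, which is the loss of visibility
for hot-unplug events raised in the reply.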