public inbox for linux-next@vger.kernel.org
* [PATCH 0/2] tracing/tlb/x86: Fix splat of calling RCU trace code on offline CPU
@ 2015-02-06 20:06 Steven Rostedt
  2015-02-06 20:06 ` [PATCH 1/2] tracing: Add condition check to RCU lockdep checks Steven Rostedt
                   ` (2 more replies)
  0 siblings, 3 replies; 44+ messages in thread
From: Steven Rostedt @ 2015-02-06 20:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: Paul E. McKenney, Dave Hansen, Rafael J. Wysocki, linux-next,
	Stephen Rothwell, Kristen Carlson Accardi, H. Peter Anvin,
	Rik van Riel, Mel Gorman, Andrew Morton

Paul,

I found a much better fix than adding rcu_nocheck(): simply put the RCU
check inside the condition check as well. That way the RCU splat will
only trigger if the condition is also true. The condition is evaluated
whether or not the tracepoint is enabled.

Now I'm thinking that I should push the first patch through my tree as it
only touches tracing. The second patch you can freely take.

Neither patch really depends on the other, but both patches are required
to make the splat go away. If Sedat could test these patches together,
and give his tested-by tag, that would be great. I'll run my patch through
my full series of tests and then push to linux-next. You could take the second
patch and push that through your tree (linux-next). When both arrive, the
bug will be fixed. The two do not need to come in together.

Thoughts?

-- Steve


Steven Rostedt (Red Hat) (2):
      tracing: Add condition check to RCU lockdep checks
      x86/tbl/trace: Do not trace on CPU that is offline

----
 include/linux/tracepoint.h | 2 +-
 include/trace/events/tlb.h | 4 +++-
 2 files changed, 4 insertions(+), 2 deletions(-)

* Re: [PATCH 2/2] x86/tbl/trace: Do not trace on CPU that is offline
@ 2015-02-06 20:11 Sedat Dilek
  2015-02-06 20:21 ` Steven Rostedt
  2015-02-06 21:34 ` Steven Rostedt
  0 siblings, 2 replies; 44+ messages in thread
From: Sedat Dilek @ 2015-02-06 20:11 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Paul E. McKenney, Dave Hansen, Rafael J. Wysocki,
	linux-next, Stephen Rothwell, Kristen Carlson Accardi,
	H. Peter Anvin, Rik van Riel, Mel Gorman, Andrew Morton

On Fri, Feb 6, 2015 at 9:06 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
> From: "Steven Rostedt (Red Hat)" <rostedt@goodmis.org>
>

Typo in the subject: "x86/tbl/trace:" should be "x86/tlb/trace:".

- Sedat -

> When taking a CPU down for suspend and resume, a tracepoint may be called
> when the CPU has been designated offline. As tracepoints require RCU for
> protection, they must not be called if the current CPU is offline.
>
> Unfortunately, trace_tlb_flush() is called in this scenario as was noted
> by LOCKDEP:
>
> ...
>
>  Disabling non-boot CPUs ...
>  intel_pstate CPU 1 exiting
>
>  ===============================
>  smpboot: CPU 1 didn't die...
>  [ INFO: suspicious RCU usage. ]
>  3.19.0-rc7-next-20150204.1-iniza-small #1 Not tainted
>  -------------------------------
>  include/trace/events/tlb.h:35 suspicious rcu_dereference_check() usage!
>
>  other info that might help us debug this:
>
>  RCU used illegally from offline CPU!
>  rcu_scheduler_active = 1, debug_locks = 0
>  no locks held by swapper/1/0.
>
>  stack backtrace:
>  CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.19.0-rc7-next-20150204.1-iniza-small #1
>  Hardware name: SAMSUNG ELECTRONICS CO., LTD. 530U3BI/530U4BI/530U4BH/530U3BI/530U4BI/530U4BH, BIOS 13XK 03/28/2013
>   0000000000000001 ffff88011a44fe18 ffffffff817e370d 0000000000000011
>   ffff88011a448290 ffff88011a44fe48 ffffffff810d6847 ffff8800c66b9600
>   0000000000000001 ffff88011a44c000 ffffffff81cb3900 ffff88011a44fe78
>  Call Trace:
>   [<ffffffff817e370d>] dump_stack+0x4c/0x65
>   [<ffffffff810d6847>] lockdep_rcu_suspicious+0xe7/0x120
>   [<ffffffff810b71a5>] idle_task_exit+0x205/0x2c0
>   [<ffffffff81054c4e>] play_dead_common+0xe/0x50
>   [<ffffffff81054ca5>] native_play_dead+0x15/0x140
>   [<ffffffff8102963f>] arch_cpu_idle_dead+0xf/0x20
>   [<ffffffff810cd89e>] cpu_startup_entry+0x37e/0x580
>   [<ffffffff81053e20>] start_secondary+0x140/0x150
>  intel_pstate CPU 2 exiting
>
> ...
>
> By converting the tlb_flush tracepoint to a TRACE_EVENT_CONDITION where the
> condition is cpu_online(smp_processor_id()), we can avoid calling RCU protected
> code when the CPU is offline.
>
> Link: http://lkml.kernel.org/r/CA+icZUUGiGDoL5NU8RuxKzFjoLjEKRtUWx=JB8B9a0EQv-eGzQ@mail.gmail.com
>
> Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
> Suggested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
> ---
>  include/trace/events/tlb.h | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/include/trace/events/tlb.h b/include/trace/events/tlb.h
> index 13391d288107..0e7635765153 100644
> --- a/include/trace/events/tlb.h
> +++ b/include/trace/events/tlb.h
> @@ -13,11 +13,13 @@
>         { TLB_LOCAL_SHOOTDOWN,          "local shootdown" },            \
>         { TLB_LOCAL_MM_SHOOTDOWN,       "local mm shootdown" }
>
> -TRACE_EVENT(tlb_flush,
> +TRACE_EVENT_CONDITION(tlb_flush,
>
>         TP_PROTO(int reason, unsigned long pages),
>         TP_ARGS(reason, pages),
>
> +       TP_CONDITION(cpu_online(smp_processor_id())),
> +
>         TP_STRUCT__entry(
>                 __field(          int, reason)
>                 __field(unsigned long,  pages)
> --
> 2.1.4
>
>



Thread overview: 44+ messages
2015-02-06 20:06 [PATCH 0/2] tracing/tlb/x86: Fix splat of calling RCU trace code on offline CPU Steven Rostedt
2015-02-06 20:06 ` [PATCH 1/2] tracing: Add condition check to RCU lockdep checks Steven Rostedt
2015-02-06 20:11   ` Steven Rostedt
2015-02-06 20:13   ` [PATCH 1/2 v2] " Steven Rostedt
2015-02-06 20:06 ` [PATCH 2/2] x86/tbl/trace: Do not trace on CPU that is offline Steven Rostedt
2015-02-06 23:27   ` Paul E. McKenney
2015-02-07  4:02     ` Steven Rostedt
2015-02-07  8:01       ` Sedat Dilek
2015-02-07 15:20         ` Steven Rostedt
2015-02-07 19:50           ` Sedat Dilek
2015-02-07 20:09           ` Paul E. McKenney
2015-02-07 20:14             ` Sedat Dilek
2015-02-07 21:52             ` Steven Rostedt
2015-02-07 22:14               ` Paul E. McKenney
2015-02-07 23:01                 ` Sedat Dilek
2015-02-07 23:48               ` Dave Hansen
2015-02-07  8:13       ` Sedat Dilek
2015-02-07 15:22         ` Steven Rostedt
2015-02-06 21:07 ` [PATCH 0/2] tracing/tlb/x86: Fix splat of calling RCU trace code on offline CPU Sedat Dilek
2015-02-06 21:18   ` Steven Rostedt
2015-02-06 21:19     ` Steven Rostedt
2015-02-06 21:23       ` Sedat Dilek
2015-02-06 21:27         ` Steven Rostedt
2015-02-06 21:24       ` Steven Rostedt
2015-02-06 21:29         ` Sedat Dilek
2015-02-06 21:21     ` Sedat Dilek
2015-02-06 21:28       ` Steven Rostedt
2015-02-06 21:33       ` Steven Rostedt
2015-02-06 21:38   ` Paul E. McKenney
2015-02-06 22:13     ` Sedat Dilek
2015-02-06 22:35       ` Steven Rostedt
2015-02-06 22:48         ` Paul E. McKenney
2015-02-06 22:51           ` Sedat Dilek
2015-02-06 23:02             ` Sedat Dilek
2015-02-06 23:04             ` Paul E. McKenney
2015-02-06 23:04           ` Steven Rostedt
  -- strict thread matches above, loose matches on Subject: below --
2015-02-06 20:11 [PATCH 2/2] x86/tbl/trace: Do not trace on CPU that is offline Sedat Dilek
2015-02-06 20:21 ` Steven Rostedt
2015-02-06 20:23   ` Sedat Dilek
2015-02-06 20:26     ` Steven Rostedt
2015-02-06 21:34 ` Steven Rostedt
2015-02-06 21:39   ` Sedat Dilek
2015-02-06 21:42     ` Steven Rostedt
2015-02-06 22:32       ` Sedat Dilek
